```bash
.
├── src/
│   ├── models/
│   └── __init__.py
├── notebooks/          # for experiments, prototyping
├── output/             # outputs
├── scripts/            # scripts for running pipelines
├── tests/              # pytest unit tests
├── config/             # env, secrets, configs
├── .gitignore
├── .env
├── pyproject.toml      # or requirements.txt, environment.yml
└── README.md
```

See https://guicommits.com/organize-python-code-like-a-pro/ and https://packaging.python.org/en/latest/tutorials/packaging-projects/ for background on this layout.

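
The layout above can be created by hand, but a small script makes it repeatable. The following is a minimal sketch, not part of the original project: the `scaffold` function and the `DIRECTORIES` and `FILES` constants are hypothetical names, and it assumes that `models/` and `__init__.py` sit under `src/`.

```python
from pathlib import Path

# Directories and files taken from the tree above.
# Assumption: models/ and __init__.py are nested under src/.
DIRECTORIES = ["src/models", "notebooks", "output", "scripts", "tests", "config"]
FILES = ["src/__init__.py", ".gitignore", ".env", "pyproject.toml", "README.md"]


def scaffold(root: Path) -> list[Path]:
    """Create the project skeleton under `root`.

    Existing paths are left untouched. Returns the paths that were created.
    """
    created = []
    for relative in DIRECTORIES:
        path = root / relative
        if not path.exists():
            path.mkdir(parents=True, exist_ok=True)
            created.append(path)
    for relative in FILES:
        path = root / relative
        if not path.exists():
            path.parent.mkdir(parents=True, exist_ok=True)
            path.touch()  # empty placeholder; fill in the content yourself
            created.append(path)
    return created


if __name__ == "__main__":
    for path in scaffold(Path.cwd()):
        print(f"created {path}")
```

Because existing paths are skipped, running the script a second time in the same folder creates nothing and overwrites nothing.
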
- **`src/`**: holds your API: the importable package code, including `models/`.
- **`scripts/`**: keeps runnable scripts clean, each with one job (build the index, query, etc.).
- **`output/`**: keeps MinerU artifacts and processed files out of source control but in a predictable place.
- **`tests/`**: lets you write unit tests for `CustomParser.parse_document` so you can check text, table, and image parsing independently.
- **`notebooks/`**: keeps exploratory work apart from production code.
- **`config/`**: keeps environment variables, connection settings, and API keys separate from the code.