```bash
.
├── src/
│   ├── models/
│   └── __init__.py
├── notebooks/        # for experiments, prototyping
├── output/           # outputs
├── scripts/          # scripts for running pipelines
├── tests/            # pytest unit tests
├── config/           # env, secrets, configs
├── .gitignore
├── .env
├── pyproject.toml    # or requirements.txt, environment.yml
└── README.md
```

https://guicommits.com/organize-python-code-like-a-pro/

https://packaging.python.org/en/latest/tutorials/packaging-projects/

- **`src/`**: your API.
- **`scripts/`**: keeps runnable scripts clean, each with one job (build the index, query, etc.).
- **`output/`**: keeps MinerU artifacts and processed files out of source control but in a predictable place.
- **`tests/`**: lets you write unit tests for `CustomParser.parse_document` so you can check text, table, and image parsing independently.
- **`notebooks/`**: for exploration, without mixing in production code.
- **`config/`**: keeps environment variables, connection settings, and API keys separate.
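
The layout above can be created by hand, but a small helper makes it repeatable. The following is a minimal sketch using only the standard library; the function name `scaffold`, the `DIRS` and `FILES` constants, and the placement of `models/` and `__init__.py` inside `src/` are assumptions made for illustration, not part of any existing tool.

```python
from __future__ import annotations

from pathlib import Path

# Directories and files from the layout above.
# Assumption: models/ and __init__.py live inside src/.
DIRS = ["src/models", "notebooks", "output", "scripts", "tests", "config"]
FILES = ["src/__init__.py", ".gitignore", ".env", "pyproject.toml", "README.md"]


def scaffold(root: Path) -> list[Path]:
    """Create the project skeleton under `root`.

    Existing directories and files are left untouched, so the function
    is safe to run more than once. Returns the paths it actually created.
    """
    created: list[Path] = []

    for name in DIRS:
        path = root / name
        if not path.exists():
            # parents=True also creates intermediate directories such as src/
            path.mkdir(parents=True)
            created.append(path)

    for name in FILES:
        path = root / name
        if not path.exists():
            path.parent.mkdir(parents=True, exist_ok=True)
            path.touch()  # empty placeholder; fill in real content afterwards
            created.append(path)

    return created
```

Calling `scaffold(Path("my_project"))` (the directory name is hypothetical) produces the empty skeleton; a second call returns an empty list because nothing is overwritten. Remember to list `.env` and `output/` in `.gitignore` so secrets and generated artifacts stay out of source control.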