[LangExtract](https://github.com/google/langextract) is a new library (released literally today) for extracting structured information from text. See the [medication extraction example](https://github.com/google/langextract/blob/main/docs/examples/medication_examples.md). However, it doesn't parse PDFs, so the text has to be extracted from them first.

Named Entity Linking (disambiguation, normalization) in spaCy:
- https://youtu.be/8u57WSXVpmw?si=9kM4SPsIUsgnCP7Z
- https://youtu.be/JIz-hiRrZ2g?si=eFSIvfN5hJAVUw0P
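
To make the two terms above concrete, here is a minimal stdlib-only sketch of what linking does conceptually. It is not spaCy's API and not how the videos implement it: the knowledge base, the `KBEntry` type, the `link` function, and the `KB:...` identifiers are all hypothetical, and the context-overlap scoring is a deliberately naive stand-in for a trained model. Normalization maps surface variants to one canonical entry; disambiguation picks among several candidates for the same alias using surrounding words.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class KBEntry:
    entity_id: str                 # hypothetical identifier, not from a real ontology
    canonical_name: str            # the normalized form we want to output
    context_words: frozenset[str]  # words that hint at this sense


ASPIRIN = KBEntry("KB:0001", "aspirin", frozenset({"mg", "daily", "dose", "tablet"}))
ASA_CLASS = KBEntry(
    "KB:0002", "ASA physical status", frozenset({"class", "surgery", "anesthesia"})
)

# Toy alias table: lowercased surface form -> candidate entries.
ALIASES: dict[str, list[KBEntry]] = {
    "aspirin": [ASPIRIN],
    "acetylsalicylic acid": [ASPIRIN],
    "asa": [ASPIRIN, ASA_CLASS],  # ambiguous alias, needs disambiguation
}


def link(mention: str, context: str) -> KBEntry | None:
    """Return the best-matching entry for a mention, or None if it is unknown."""
    candidates = ALIASES.get(mention.lower().strip(), [])
    if not candidates:
        return None
    tokens = set(context.lower().split())
    # Naive disambiguation: the candidate sharing the most words with the context wins.
    # On a tie, max() keeps the first candidate in the list.
    return max(candidates, key=lambda entry: len(entry.context_words & tokens))


if __name__ == "__main__":
    for mention, context in [
        ("ASA", "patient takes ASA 81 mg daily"),
        ("ASA", "ASA class III before surgery"),
        ("Acetylsalicylic Acid", ""),
    ]:
        entry = link(mention, context)
        print(mention, "->", entry.canonical_name if entry else None)
```

A real pipeline would replace the alias table with a proper knowledge base and the overlap score with a learned ranker, but the input and output shape stays the same: mention plus context in, canonical entry out.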