[LangExtract](https://github.com/google/langextract) is a new library (released literally today) for extracting structured information from text. See the [medical example](https://github.com/google/langextract/blob/main/docs/examples/medication_examples.md). Note, however, that it doesn't parse PDFs.
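
Since the library takes text rather than PDFs, one workaround is to convert the PDF to plain text first and pass the result on. The sketch below is only an illustration under assumptions: `pdf_to_text` and `extract_from_pdf` are hypothetical helper names, the third-party `pypdf` package is assumed for text extraction, and the `extractor` argument stands in for whatever structured-extraction call you use.

```python
from typing import Any, Callable


def pdf_to_text(path: str) -> str:
    """Extract plain text from a PDF with pypdf (assumed to be installed)."""
    # Imported lazily so this module still loads when pypdf is absent.
    from pypdf import PdfReader

    reader = PdfReader(path)
    # Scanned or image-only pages may yield no text at all.
    return "\n".join(page.extract_text() or "" for page in reader.pages)


def extract_from_pdf(
    path: str,
    extractor: Callable[[str], Any],
    to_text: Callable[[str], str] = pdf_to_text,
) -> Any:
    """Convert the PDF to text, then hand that text to the extractor."""
    text = to_text(path)
    if not text.strip():
        # Nothing to extract from; the PDF probably needs OCR first.
        raise ValueError(f"no extractable text in {path!r}")
    return extractor(text)
```

For LangExtract, the `extractor` would presumably be a small wrapper around `lx.extract` configured with a prompt and examples as in the project's README; that wiring is left out here because it needs a model and an API key.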