Hi Everyone,

We have a few use cases where organizations intend to upload their organizational documents to digital libraries. Because manual cataloguing would require significant resources, we have proposed an LLM-based approach.

With the LLM-based solution, we would first convert the images of the relevant pages to text, if the pages are not already in text format. The text of the relevant pages from the PDF document would then be ingested into a GPT model (primarily OpenAI) to provide the digital library Editor with an auto-suggested document index (catalog data), a summary for the abstract, and keywords. The index data would then be posted to the digital library’s catalog API as a draft catalog record for the Editor to review and edit.
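To make the flow above concrete, here is a minimal Python sketch of the middle step: turning page text into a draft catalog record. It is only an illustration under assumptions, not our actual implementation. The names DraftCatalogRecord, build_prompt and suggest_catalog_record, the JSON keys and the 12,000-character limit are all hypothetical. The LLM call is passed in as a plain function (call_llm), so the sketch does not depend on any particular vendor SDK. OCR and the catalog API call are left out.

```python
import json
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class DraftCatalogRecord:
    """Hypothetical shape of the draft record sent for Editor review."""
    title: str
    abstract: str
    keywords: List[str] = field(default_factory=list)
    status: str = "draft"  # the Editor reviews and edits before publishing


# Hypothetical prompt; the wording and the JSON keys are assumptions.
PROMPT_TEMPLATE = (
    "You are assisting a digital library cataloguer.\n"
    "From the document text below, return a JSON object with the keys "
    '"title", "abstract" and "keywords" (a list of strings).\n\n'
    "Document text:\n{text}"
)


def build_prompt(page_texts: List[str], max_chars: int = 12000) -> str:
    """Join the non-empty page texts and truncate to an assumed size limit."""
    text = "\n\n".join(t.strip() for t in page_texts if t.strip())
    return PROMPT_TEMPLATE.format(text=text[:max_chars])


def suggest_catalog_record(
    page_texts: List[str], call_llm: Callable[[str], str]
) -> DraftCatalogRecord:
    """Ask the LLM (any function mapping prompt -> JSON string) for suggestions."""
    raw = call_llm(build_prompt(page_texts))
    data = json.loads(raw)  # raises ValueError if the reply is not valid JSON
    keywords = [str(k).strip() for k in data.get("keywords", []) if str(k).strip()]
    return DraftCatalogRecord(
        title=str(data.get("title", "")).strip(),
        abstract=str(data.get("abstract", "")).strip(),
        keywords=keywords,
    )
```

In practice call_llm would wrap whichever model API is used, and the returned record would be serialized and posted to the catalog API as a draft. Keeping the model call behind a plain function also makes it easy to test the parsing with canned responses and to compare the suggestions against manually catalogued records when measuring accuracy.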

We would be interested in hearing about the community's experiences with similar approaches and the level of accuracy you are getting in your implementations.

Best regards,

Atanu

Atanu Garai
SocialWell
w: www.socialwell.net
