Hi Everyone,
We have a few use cases where organizations intend to upload their organizational documents to digital libraries. Because manual cataloguing would require significant resources, we have proposed an LLM-based approach.
With the LLM-based solution, we would first convert images of the relevant pages to text, if the pages are not already in text format. The text of the relevant pages from the PDF document would then be sent to a GPT model (primarily OpenAI) to provide the digital library's Editor with an auto-suggested document index (catalog data), a summary for the abstract, and keywords. The index data is then posted to the digital library's catalog API as a draft catalog record for the Editor to review and edit.
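To make the discussion concrete, here is a minimal sketch of the suggestion step described above, in Python. Everything named in it is an assumption for illustration only: the function names, the prompt wording, and the record fields (title, authors, abstract, keywords) are hypothetical, and call_llm stands in for whichever model client is used. The image-to-text conversion and the post to the catalog API are left out.

```python
import json
from typing import Callable

# Hypothetical prompt; the wording and the requested fields are assumptions.
PROMPT_TEMPLATE = (
    "You are assisting a digital library editor.\n"
    "From the document text below, return a JSON object with the keys "
    '"title", "authors" (a list), "abstract", and "keywords" (a list).\n\n'
    "Document text:\n{text}"
)


def build_prompt(pages: list[str], max_chars: int = 12000) -> str:
    """Join the non-empty page texts and truncate to a rough size limit."""
    text = "\n\n".join(p.strip() for p in pages if p.strip())
    return PROMPT_TEMPLATE.format(text=text[:max_chars])


def _as_list(value) -> list[str]:
    """Coerce a model-returned value into a list of strings."""
    if isinstance(value, list):
        return [str(v) for v in value]
    if isinstance(value, str) and value.strip():
        return [value.strip()]
    return []


def suggest_catalog_record(pages: list[str], call_llm: Callable[[str], str]) -> dict:
    """Ask the model for catalog suggestions and return a draft record.

    call_llm is any function that takes a prompt string and returns the
    model's reply as a string (a placeholder for the real client call).
    """
    raw = call_llm(build_prompt(pages))
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        data = {}  # unusable reply: fall back to an empty draft
    if not isinstance(data, dict):
        data = {}
    return {
        "title": str(data.get("title", "")),
        "authors": _as_list(data.get("authors")),
        "abstract": str(data.get("abstract", "")),
        "keywords": _as_list(data.get("keywords")),
        "status": "draft",  # always a draft: the Editor reviews and edits
    }
```

Passing the model call in as a function keeps the sketch independent of any particular provider and makes it easy to test with a canned reply. A malformed reply yields an empty draft rather than an error, so the Editor still gets a record to fill in by hand.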
We would be interested in hearing about the community's experience with similar implementations and the level of accuracy you are getting.
Best regards,
Atanu
Atanu Garai
SocialWell
w: www.socialwell.net