Molecular search in heterogeneous documents

Company’s challenge

The “New Product Program” department of a pharma company has to deal with a constantly growing volume of data to annotate and manage, with fewer people available in the team. The risk is that annotation stops altogether and the wealth of the scientific data is lost.

This information is found in a heterogeneous set of documents:

  • Unstructured – reports, images, analyses, …
  • Of all types – PDF, image, Office, CSV, …


  • Inquiro with chemistry module deployed and configured in 3 weeks
  • More than 500,000 documents indexed at the initial load (including images and scanned documents)
  • 3 data sources connected (file share, CMS, in-house application)
  • Private and public thesauri used for annotation (9 chemical thesauri and 11 galenic thesauri)


Thanks to the mix of public and private ontologies managed in Inquiro, automatic annotation makes it possible to do the same job with 60% fewer curators.

Searches are 30x faster (“1 search in 10 min versus 2-3 searches per day on the shared drive”).

Full-text search is efficient and relevant for all document formats (scanned documents via OCR, images, …).

Search by chemical structure and by free text is possible on all indexed documents.

Client Testimonial

Searches are very fast and very efficient. We can now find a piece of information in 10 minutes, compared with about 4 hours before Inquiro.

Users are impressed!

Head of Scientific and Regulatory Documentation

Pharmaceutical Industry