ChemDataExtractor

The premium Open Source alternative to Elsevier Reaxys

🎯 Best for: Building custom chemical property databases from published literature.

What is ChemDataExtractor?

ChemDataExtractor is an open-source alternative to manual data curation services like CAS SciFinder. It uses natural language processing and chemical named entity recognition to extract chemical properties and structures from PDF and HTML documents.
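
As a rough illustration of how this might look in practice, the sketch below parses one document and flattens any melting points it finds into rows for a custom property database. It assumes the `Document.from_file`, `records`, and `serialize()` calls described in the ChemDataExtractor 1.x documentation. The helper names are hypothetical, and the record layout (`names`, `melting_points`) is an assumption that may differ between versions.

```python
def melting_point_rows(serialized_records):
    """Flatten serialized compound records into (name, value, units) rows.

    Assumes each record is a dict that may carry 'names' and
    'melting_points' keys (an assumption based on ChemDataExtractor 1.x output).
    """
    rows = []
    for record in serialized_records:
        # Fall back to a placeholder when no compound name was resolved.
        names = record.get("names") or ["<unnamed>"]
        for mp in record.get("melting_points", []):
            rows.append((names[0], mp.get("value"), mp.get("units")))
    return rows


def extract_melting_points(path):
    """Parse one PDF or HTML file and return its melting point rows."""
    # Imported lazily so melting_point_rows() works without the library installed.
    from chemdataextractor import Document

    with open(path, "rb") as f:
        doc = Document.from_file(f)
    return melting_point_rows(doc.records.serialize())
```

Calling `extract_melting_points` on a local article file (for example a hypothetical `paper.html`) would return a list of tuples that can be written straight to a CSV file or database table. Keeping the flattening step separate from the parsing step makes it easy to check the row logic without running the full NLP pipeline.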

Tech Stack
Python
Chemistry & Materials

Why ChemDataExtractor?

  • Multi-format support (PDF/HTML)
  • Built-in chemical dictionary
  • Extensible extraction rules
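
To make the dictionary and rule ideas concrete, here is a toy sketch using only the Python standard library. It is not ChemDataExtractor's implementation: a small hand-written set stands in for the built-in chemical dictionary, and a regular expression stands in for an extraction rule. All names (`CHEMICAL_DICTIONARY`, `MELTING_POINT_RULE`, `extract`) are hypothetical.

```python
import re

# Stand-in for a chemical dictionary; a real one holds far more entries.
CHEMICAL_DICTIONARY = {"benzene", "toluene", "acetone"}

# Stand-in for an extraction rule: captures a Celsius melting point.
MELTING_POINT_RULE = re.compile(
    r"(?:melting point|m\.p\.)\s*(?:of|is|was|:)?\s*(-?\d+(?:\.\d+)?)\s*°\s*C",
    re.IGNORECASE,
)


def extract(sentence, rules=None):
    """Return dictionary-matched compounds and rule-matched properties."""
    rules = rules or {"melting_point_C": MELTING_POINT_RULE}
    # Dictionary lookup: keep tokens that appear in the chemical dictionary.
    tokens = re.findall(r"[A-Za-z][A-Za-z0-9\-]*", sentence)
    compounds = [t for t in tokens if t.lower() in CHEMICAL_DICTIONARY]
    # Rule matching: each rule contributes one numeric property if it fires.
    properties = {}
    for name, pattern in rules.items():
        match = pattern.search(sentence)
        if match:
            properties[name] = float(match.group(1))
    return {"compounds": compounds, "properties": properties}
```

Passing a different `rules` mapping is the toy equivalent of extending the extraction rules: a new property only needs a new named pattern, and the dictionary lookup stays unchanged.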

Limitations

  • Requires high-quality OCR for scanned PDFs
  • Complex dependency tree
  • Occasional false positives in named entity recognition (NER)

Last Update: 2/25/2026
Forks: 121
Issues: 23
License: MIT

Financial Leak Detected

Stop the "SaaS Tax"

Your team could be burning cash. Switching to ChemDataExtractor instantly boosts your runway.

Competitor Cost: $1,440 / year (est. based on Elsevier Reaxys)
Self-Hosted: $0 / year
Team Size: 10 Users
SAVE 100%
