ChemDataExtractor
The premium Open Source alternative to Elsevier Reaxys
🎯 Best for: Building custom chemical property databases from published literature.
What is ChemDataExtractor?
ChemDataExtractor is an open-source alternative to manual data curation services like CAS SciFinder. It uses natural language processing and chemical named entity recognition (NER) to extract properties and structures from PDF and HTML documents.
Tech Stack
Python
Chemistry & Materials
Why ChemDataExtractor?
- Multi-format support (PDF/HTML)
- Built-in chemical dictionary
- Extensible extraction rules
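To make the dictionary and rule ideas above concrete, here is a minimal standard-library sketch of dictionary-based entity spotting combined with a pluggable property rule. It is a toy illustration only, not ChemDataExtractor's implementation or API: `CHEMICAL_DICTIONARY`, `PROPERTY_RULES`, `find_chemicals`, and `extract_records` are hypothetical names invented for this example, and the four-word dictionary stands in for a real lexicon.

```python
import re

# Toy stand-in for a chemical dictionary (assumption: real tools ship far larger lexicons).
CHEMICAL_DICTIONARY = {"benzene", "toluene", "acetone", "ethanol"}

# Hypothetical rule registry: property name -> regex with "value" and "unit" groups.
# New rules can be added to this dict without touching the extraction loop.
PROPERTY_RULES = {
    "melting_point": re.compile(
        r"melting point of\s+(?P<value>-?\d+(?:\.\d+)?)\s*(?P<unit>°C|K)",
        re.IGNORECASE,
    ),
}


def find_chemicals(sentence):
    """Return dictionary terms found in the sentence, in order of appearance."""
    tokens = re.findall(r"[A-Za-z][A-Za-z0-9\-]*", sentence)
    return [token for token in tokens if token.lower() in CHEMICAL_DICTIONARY]


def extract_records(text):
    """Pair the first chemical name in each sentence with any matching property rule."""
    records = []
    # Split on whitespace that follows sentence-ending punctuation.
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        names = find_chemicals(sentence)
        if not names:
            continue
        for prop, pattern in PROPERTY_RULES.items():
            match = pattern.search(sentence)
            if match:
                records.append({
                    "name": names[0],
                    "property": prop,
                    "value": float(match.group("value")),
                    "unit": match.group("unit"),
                })
    return records


text = "Benzene has a melting point of 5.5 °C. The sample was stirred overnight."
print(extract_records(text))
```

Registering another entry in `PROPERTY_RULES` (for example a boiling-point pattern) extends the extractor without changing `extract_records`. A real toolkit replaces the exact-match lookup with trained NER models and the regexes with grammar-based parsers, which is also where the false positives listed under Limitations tend to come from.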
Limitations
- Requires high-quality OCR for PDFs
- Complex dependency tree
- Occasional false positives in NER
Last Update: 2/25/2026
Forks: 121
Issues: 23
License: MIT
Financial Leak Detected
Stop the "SaaS Tax"
Your team could be burning cash. Switching to ChemDataExtractor instantly boosts your runway.
Competitor Cost: -$1,440 / year (est. based on Elsevier Reaxys)
Self-Hosted: $0 / year
Team Size: 10 Users
SAVE 100%