DataProfiler

The premium Open Source alternative to Informatica

🎯 Best for: Automating the discovery of sensitive information in massive, unstructured datasets.
Stars: 1.5k
License: Apache-2.0

What is DataProfiler?

DataProfiler is a Python library, positioned as an alternative to Great Expectations, that uses machine learning to detect sensitive data and compute statistical distributions automatically. It generates reports on dataset health and PII presence without manual tagging.
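
To make the idea of a per-column profile with a PII flag concrete, here is a minimal standard-library sketch. It is not DataProfiler's API or method: DataProfiler relies on a machine-learning labeler, whereas this stand-in uses two hand-written regular expressions. The names `PII_PATTERNS`, `profile_column`, `profile_csv`, and the sample data are all hypothetical.

```python
import csv
import io
import re
import statistics

# Hypothetical regex patterns standing in for a trained PII labeler.
PII_PATTERNS = {
    "EMAIL": re.compile(r"^[\w.+-]+@[\w-]+\.[\w.-]+$"),
    "SSN": re.compile(r"^\d{3}-\d{2}-\d{4}$"),
}


def profile_column(values):
    """Return basic statistics and a guessed PII label for one column."""
    non_empty = [v for v in values if v != ""]
    summary = {
        "count": len(values),
        "null_count": len(values) - len(non_empty),
        "unique_count": len(set(non_empty)),
        "pii_label": None,
    }
    # Label the column when every non-empty value matches one pattern.
    for label, pattern in PII_PATTERNS.items():
        if non_empty and all(pattern.match(v) for v in non_empty):
            summary["pii_label"] = label
            break
    # Add numeric statistics when every non-empty value parses as a float.
    try:
        numbers = [float(v) for v in non_empty]
    except ValueError:
        numbers = []
    if numbers:
        summary["mean"] = statistics.mean(numbers)
        summary["min"] = min(numbers)
        summary["max"] = max(numbers)
    return summary


def profile_csv(text):
    """Profile every column of a CSV document given as a string."""
    rows = list(csv.DictReader(io.StringIO(text)))
    columns = rows[0].keys() if rows else []
    return {name: profile_column([row[name] for row in rows]) for name in columns}


# Made-up sample data for illustration only.
SAMPLE = """name,email,age
Ana,ana@example.com,34
Ben,ben@example.org,
Cy,cy@example.net,29
"""

report = profile_csv(SAMPLE)
for column, summary in report.items():
    print(column, summary)
```

In this sketch the `email` column is flagged because every value matches the email pattern, and the `age` column gets numeric statistics plus a count of missing values. A regex approach like this only catches rigidly formatted values, which is the gap a learned labeler is meant to close.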

Tech Stack
Python | AI, ML & Data

Why DataProfiler?

  • High-performance profiling
  • Built-in PII detection
  • Flexible API

Limitations

  • High memory usage
  • Python dependency
  • Model training overhead
Last Update: 3/5/2026
Forks: 185
Issues: 82
Financial Leak Detected

Stop the "SaaS Tax"

Your team could be burning cash. Switching to DataProfiler instantly boosts your runway.

Competitor Cost: -$1,440 / year (est. based on Informatica)
Self-Hosted: $0 / year
Team Size: 10 Users
SAVE 100%
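
The 100% figure follows from comparing the listed license estimate with a $0 self-hosted cost. A minimal sketch of that arithmetic is below; the function name `annual_savings` is hypothetical, and the calculation takes the page's numbers at face value, so it assumes self-hosting carries no infrastructure or maintenance cost.

```python
def annual_savings(competitor_cost, self_hosted_cost):
    """Return (absolute savings, savings as a percentage of the competitor cost)."""
    saved = competitor_cost - self_hosted_cost
    percent = 100 * saved / competitor_cost
    return saved, percent


# Figures taken from the listing: $1,440 / year estimate vs. $0 self-hosted.
saved, percent = annual_savings(1440, 0)
print(f"Saved ${saved} per year ({percent:.0f}%)")
```

Passing a nonzero `self_hosted_cost` (for example, an estimate of server spend) gives a more conservative percentage.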
