Apache DRAT mission statement

What does DRAT stand for?

Distributed Release Audit Tool. Built on the shoulders of Apache Creadur's Release Audit Tool (RAT), this project aims to scale license checks out to large code bases.

What does DRAT want?

The Distributed Release Audit Tool (DRAT) improves over the Apache RAT code audit tool in several ways.
RAT is a command line tool, Java API, and Maven plugin that audits a code base against its declared OSS license: if you say it's Apache2, RAT will check whether or not your source is Apache2 and produce a report stating which files are or aren't, and why. RAT has several problems, namely:

  • It doesn't scale to large code bases: a run on a code base of 25k files and 10M LOC took ~4 weeks on a normal Linux server with 5GB of memory, plenty of hard disk, and modern CPUs.
  • RAT's crawler is rudimentary: you have to use explicit white/black lists to say which files to skip, or else it will check binary files for licenses.
  • RAT doesn't produce incremental output. It either completes and generates a log, or it doesn't.
DRAT addresses all of the above concerns.
DRAT is a Map Reduce version of RAT. It uses Apache Tika to automatically sort and classify the code base files, and Apache OODT to index metadata and Tika information about those files into Apache Solr. OODT also produces a Map Reduce workflow that runs RAT incrementally on k-sized chunks of same-MIME-typed files (as detected by Tika), produces incremental per-type logs, and then aggregates and reduces them into a combined log at the end.
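
The chunk-and-reduce idea described above can be illustrated with a small, self-contained sketch. This is not DRAT's code and it uses none of the Tika, OODT, Solr or RAT APIs: Python's standard mimetypes module stands in for Tika's type detection, and audit_chunk, run_audit, read_text and LICENSE_MARKER are hypothetical names invented for this illustration. The "audit" here is only a naive substring check, assumed purely to keep the example runnable.

```python
import mimetypes
from collections import defaultdict
from itertools import islice

# Illustrative marker only; a real audit recognises many license headers.
LICENSE_MARKER = "Licensed under the Apache License, Version 2.0"


def group_by_mime(paths):
    """Group paths by guessed MIME type (stdlib stand-in for Tika detection)."""
    groups = defaultdict(list)
    for path in paths:
        mime, _encoding = mimetypes.guess_type(path)
        groups[mime or "application/octet-stream"].append(path)
    return groups


def chunks(items, k):
    """Yield successive lists of at most k items."""
    it = iter(items)
    while True:
        chunk = list(islice(it, k))
        if not chunk:
            return
        yield chunk


def audit_chunk(chunk, read_text):
    """Map step: hypothetical stand-in for one audit run over a chunk."""
    return [(path, LICENSE_MARKER in read_text(path)) for path in chunk]


def run_audit(paths, k, read_text):
    """Audit k-sized chunks per MIME type, then reduce to one summary."""
    per_type_logs = defaultdict(list)
    for mime, files in group_by_mime(paths).items():
        for chunk in chunks(files, k):
            # Each finished chunk extends its type's log, so partial
            # results exist before the whole run completes.
            per_type_logs[mime].extend(audit_chunk(chunk, read_text))

    # Reduce step: fold the per-type logs into a combined summary.
    combined = {"approved": 0, "unapproved": 0}
    for entries in per_type_logs.values():
        for _path, ok in entries:
            combined["approved" if ok else "unapproved"] += 1
    return dict(per_type_logs), combined
```

In this sketch read_text is any callable that returns a file's text for a given path, so the same code can be tried against an in-memory dict of sample files before pointing it at a real directory. The chunks run sequentially here; in a distributed setting each chunk would be an independent unit of work.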

What's the status of the project?

In September 2017 the project was granted top-level status, after being developed for a while on GitHub.

Do you want to contribute?

You can find our GitHub repository at: https://github.com/apache/drat/
Our mailing list is dev@drat.apache.org