A large, un-preprocessed, multi-relational and partially documented database extract. This data is intended for use in research on
pre-processing techniques for real-world data.
"The KDD-Sisyphus Workgroup provides the Sisyphus I package which is based on data extracted from a real-world insurance business application. As such it shows typical properties like fragmentation, varying data quality, irregular data value codings, etc. which makes the application of data mining or machine learning algorithms a real
challenge and usually requires sophisticated preprocessing methods."
The data was previously available at http://research.swisslife.ch/kdd-sisyphus/, but no current source is known.