A large, multi-relational, partially documented database extract that has not been preprocessed. The data is intended for research on preprocessing techniques for real-world data.

"The KDD-Sisyphus Workgroup provides the Sisyphus I package which is based on data extracted from a real-world insurance business application. As such it shows typical properties like fragmentation, varying data quality, irregular data value codings, etc. which makes the application of data mining or machine learning algorithms a real challenge and usually requires sophisticated preprocessing methods."

The data was previously available at http://research.swisslife.ch/kdd-sisyphus/ but no current source is known.
Topic revision: r3 - 03 Nov 2005 - 13:33:09 - Andy Pryke

