We can't find the internet
Attempting to reconnect
Something went wrong!
Hang in there while we get back on track
Discrimination is Wrong: Improving Productivity • Edward Kmett • YOW! 2015
Learn how discrimination sorts data in O(n) time instead of O(n log n), using discriminators and category theory concepts to boost productivity in operations like sorting and joins.
-
Discrimination is a technique that allows sorting and grouping data in linear O(n) time, breaking the typical O(n log n) sorting bound by not doing pairwise comparisons
-
Key benefits include building data structures like Map and Set in linear time instead of O(n log n), and performing fast table joins and grouping operations
-
The technique uses “discriminators” that break data into equivalence classes based on their keys, rather than comparing elements pairwise
-
American Flag Sort and Radix Sort are implementations of discrimination techniques that work top-down and bottom-up respectively to achieve linear time sorting
-
The concept builds on category theory ideas like monoids, functors, and natural transformations to create a robust theoretical foundation
-
The implementation uses unsafe IO operations and custom foreign prims in Haskell to achieve the required performance characteristics
-
Law requirements include maintaining stability (preserving original order within equivalence classes) and proper handling of sums/products in algebraic data types
-
Provides potential for streaming/productive implementations where results can start being produced before seeing entire input
-
Allows operations like
nub
(removing duplicates) to work in O(n) time instead of O(n²) -
Originally developed by Fritz Henglein at the University of Copenhagen through several papers between 2007-2011