-- Donnacha Oisín Kidney ABOUT CONTACT FEED SNIPPETS
-- ================================================================================
-- ┌──────────────────────────────────────────────────────────────────────────────┐
-- │ oooooooooooo o8o │
-- │ `888' `8 `"' │
-- │ 888 oooo ooo. .oo. .oooooooo .ooooo. oooo d8b │
-- │ 888oooo8 `888 `888P"Y88b 888' `88b d88' `88b `888""8P │
-- │ 888 " 888 888 888 888 888 888ooo888 888 │
-- │ 888 888 888 888 `88bod8P' 888 .o 888 │
-- │ o888o o888o o888o o888o `8oooooo. `Y8bod8P' d888b │
-- │ d" YD │
-- │ "Y88888P' │
-- │ │
-- │ ooooooooooooo ooooo │
-- │ 8' 888 `8 `888' │
-- │ 888 oooo d8b .ooooo. .ooooo. .oooo.o 888 ooo. .oo. │
-- │ 888 `888""8P d88' `88b d88' `88b d88( "8 888 `888P"Y88b │
-- │ 888 888 888ooo888 888ooo888 `"Y88b. 888 888 888 │
-- │ 888 888 888 .o 888 .o o. )88b 888 888 888 │
-- │ o888o d888b `Y8bod8P' `Y8bod8P' 8""888P' o888o o888o o888o │
-- │ │
-- │ │
-- │ │
-- │ .o. .o8 │
-- │ .888. "888 │
-- │ .8"888. .oooooooo .oooo888 .oooo. │
-- │ .8' `888. 888' `88b d88' `888 `P )88b │
-- │ .88ooo8888. 888 888 888 888 .oP"888 │
-- │ .8' `888. `88bod8P' 888 888 d8( 888 │
-- │ o88o o8888o `8oooooo. `Y8bod88P" `Y888""8o │
-- │ d" YD │
-- │ "Y88888P' │
-- └──────────────────────────────────────────────────────────────────────────────┘
-- Posted on February 25, 2019
open import Data.FingerTree.Preamble
module Data.FingerTree
{c m a}
(imports : Imports c m a)
where
open Imports imports
import Data.FingerTree.Measures
module Measures = Data.FingerTree.Measures ℳ
-- As I have talked about previously, a large class of divide-and conquer
-- algorithms rely on "good" partitioning for the divide step. If you then want to
-- make the algorithms incremental, you keep all of those partitions (with their
-- summaries) in some "good" arrangement [1]. Several common data structures are
-- designed around this principle: binomial heaps, for instance, store partitions
-- of size 2ⁿ. Different ways of storing partitions favours different use cases:
-- switch from a binomial heap to a skew binomial, for instance, and you get
-- constant-time cons.
-- The standout data structure in this area is Hinze and Paterson's finger tree
-- [2]. It caches summaries in a pretty amazing way, allowing for (amortised) 𝒪(1)
-- cons and snoc and 𝒪(log n) split and append. These features allow it to be used
-- for a huge variety of things: Data.Sequence uses it as a random-access sequence,
-- but it can also work as a priority queue, a search tree, a priority search tree
-- [3], an interval tree, an order statistic tree...
-- All of these applications solely rely on an underlying monoid. As a result, I
-- thought it would be a great data structure to implement in Agda, so that you'd
-- get all of the other data structures with minimal effort [similar thinking
-- motivated a Coq implementation; 4]
module _ where
open Measures
-- ┌──────────────────────────────────────────────────────────────────────────────┐
-- │ │
-- │ ____ __ │
-- │ / ___| ___ ___ _ __ ___ ___ / _| │
-- │ \___ \ / __/ _ \| '_ \ / _ \ / _ \| |_ │
-- │ ___) | (_| (_) | |_) | __/ | (_) | _| │
-- │ |____/ \___\___/| .__/ \___| \___/|_| │
-- │ |_| │
-- │ __ __ _ __ _ _ _ │
-- │ \ \ / /__ _ __(_)/ _(_) ___ __ _| |_(_) ___ _ __ │
-- │ \ \ / / _ \ '__| | |_| |/ __/ _` | __| |/ _ \| '_ \ │
-- │ \ V / __/ | | | _| | (_| (_| | |_| | (_) | | | | │
-- │ \_/ \___|_| |_|_| |_|\___\__,_|\__|_|\___/|_| |_| │
-- │ │
-- └──────────────────────────────────────────────────────────────────────────────┘
-- There would be no real point to implementing a finger tree in Agda if we didn't
-- also prove some things about it. The scope of the proofs I've done so far are
-- intrinsic proofs of the summaries in the tree. In other words, the type of
-- cons is as follows:
_ : (x : A) → (xs : Tree A) → μ⟨ Tree A ⟩≈ (μ x ∙ μ xs)
_ = _◂_
-- This is enough to prove things about the derived data structures (like the
-- correctness of sorting if it's used as a priority queue), but it's worth
-- pointing out what I *haven't* proved (yet):
--
-- 1. Invariants on the structure ("safe" and "unsafe" digits and so on).
-- 2. The time complexity or performance of any operations.
--
-- To be honest, I'm not even sure that my current implementation is correct in
-- these regards! I'll probably have a go at proving them in the future [possibly
-- using 5].
-- ┌──────────────────────────────────────────────────────────────────────────────┐
-- │ │
-- │ __ __ _ _ _ │
-- │ | \/ | ___ _ __ ___ (_) __| |___ __ _ _ __ __| | │
-- │ | |\/| |/ _ \| '_ \ / _ \| |/ _` / __| / _` | '_ \ / _` | │
-- │ | | | | (_) | | | | (_) | | (_| \__ \ | (_| | | | | (_| | │
-- │ |_| |_|\___/|_| |_|\___/|_|\__,_|___/ \__,_|_| |_|\__,_| │
-- │ │
-- │ ____ __ │
-- │ | _ \ _ __ ___ ___ / _|___ │
-- │ | |_) | '__/ _ \ / _ \| |_/ __| │
-- │ | __/| | | (_) | (_) | _\__ \ │
-- │ |_| |_| \___/ \___/|_| |___/ │
-- │ │
-- └──────────────────────────────────────────────────────────────────────────────┘
-- The bad news is that finger trees are a relatively complex data structure, and
-- we're going to need a *lot* of proofs to write a verified version. The good news
-- is that monoids (in contrast to rings) are extremely easy to prove
-- automatically. In this project, I used reflection to do so, but I think it
-- should be possible to do with instance resolution also. The monoid solver is
-- here:
import Data.FingerTree.MonoidSolver
-- ┌──────────────────────────────────────────────────────────────────────────────┐
-- │ │
-- │ │
-- │ │
-- │ │
-- │ │
-- │ __ __ │
-- │ | \/ | ___ __ _ ___ _ _ _ __ ___ ___ │
-- │ | |\/| |/ _ \/ _` / __| | | | '__/ _ \/ __| │
-- │ | | | | __/ (_| \__ \ |_| | | | __/\__ \ │
-- │ |_| |_|\___|\__,_|___/\__,_|_| \___||___/ │
-- │ │
-- │ │
-- │ │
-- │ │
-- │ │
-- └──────────────────────────────────────────────────────────────────────────────┘
--
-- First things first, we need a way to talk about the summaries of elements we're
-- interested in. This is captured by the σ class, which has one method μ.
_ : ∀ {a} {A : Set a} ⦃ _ : σ A ⦄ → A → 𝓡
_ = μ
-- 𝓡 is the type of the summaries, and μ means "summarise". The silly symbols
-- are used for brevity: we're going to be using this thing everywhere, so it's
-- important to keep it short. Here's an example instance for lists:
_ : (xs : List A) → μ xs ≡ foldr (_∙_ ∘ μ) ε xs
_ = λ xs → refl
-- ┌──────────────────────────────────────────────────────────────────────────────┐
-- │ │
-- │ │
-- │ __ __ _ _ _ _ _ │
-- │ \ \ / /__ _ __| | _(_)_ __ __ _ __ _(_) |_| |__ │
-- │ \ \ /\ / / _ \| '__| |/ / | '_ \ / _` | \ \ /\ / / | __| '_ \ │
-- │ \ V V / (_) | | | <| | | | | (_| | \ V V /| | |_| | | | │
-- │ \_/\_/ \___/|_| |_|\_\_|_| |_|\__, | \_/\_/ |_|\__|_| |_| │
-- │ |___/ │
-- │ ____ _ _ _ │
-- │ / ___| ___| |_ ___ (_) __| |___ │
-- │ \___ \ / _ \ __/ _ \| |/ _` / __| │
-- │ ___) | __/ || (_) | | (_| \__ \ │
-- │ |____/ \___|\__\___/|_|\__,_|___/ │
-- │ │
-- │ │
-- └──────────────────────────────────────────────────────────────────────────────┘
-- As I mentioned, the tree is going to be verified intrinsically. In other word
-- its type will look something like this:
SimpleTree : Set _
SimpleTree = 𝓡 → Set
-- But before running off to define that the obvious way, I should mention that I
-- made the annoying decision to use a setoid (rather than propositional equality)
-- based monoid. This means that we don't get substitution, making the obvious
-- definition untenable.
-- I figured out a solution to the problem, but I'm not sure if I'm happy with it.
-- That's actually the main motivation for writing this post: I'm curious if other
-- people have better techniques for this kind of thing.
-- To clarify: "this kind of thing" is writing intrinsic (correct-by-construction)
-- proofs when a setoid is involved. Intrinsic proofs usually lend themselves to
-- elegance: to prove that `map` preserves a vector's length, for instance,
-- basically requires no proof at all:
map : ∀ {a b n} {A : Set a} {B : Set b}
→ (A → B)
→ Vec A n
→ Vec B n
map f [] = []
map f (x ∷ xs) = f x ∷ map f xs
-- But that's because pattern matching works well with propositional equality: in
-- the first clause, n is set to 0 automatically. If we were working with
-- setoid equality, we'd instead maybe get a proof that n ≈ 0, and we'd have to
-- figure a way to work that into the types.
open Measures hiding (arg-syntax; _>>=_; μ⟨_⟩≈_; 𝓢; 𝒻; _≈[_]; σ; Arg)
-- ┌──────────────────────────────────────────────────────────────────────────────┐
-- │ │
-- │ │
-- │ │
-- │ │
-- │ │
-- │ _____ _ _ │
-- │ | ___(_) |__ _ __ ___ ___ │
-- │ | |_ | | '_ \| '__/ _ \/ __| │
-- │ | _| | | |_) | | | __/\__ \ │
-- │ |_| |_|_.__/|_| \___||___/ │
-- │ │
-- │ │
-- │ │
-- │ │
-- │ │
-- └──────────────────────────────────────────────────────────────────────────────┘
module Fibre where
-- The first part of the solution is to define a wrapper type which stores
-- information about the size of the thing it contains:
record μ⟨_⟩≈_ {a} (Σ : Set a) ⦃ _ : σ Σ ⦄ (𝓂 : 𝓡) : Set (a ⊔ c ⊔ m) where
constructor _⇑[_]
field
𝓢 : Σ
𝒻 : μ 𝓢 ≈ 𝓂
open μ⟨_⟩≈_
-- Technically speaking, I think this is known as a "fibre". μ⟨ Σ ⟩≈ 𝓂 means "There
-- exists a Σ such that μ Σ ≈ 𝓂". Next, we'll need some combinators to work
-- with:
infixl 2 _≈[_]
_≈[_] : ∀ {a} {Σ : Set a} ⦃ _ : σ Σ ⦄ {x : 𝓡} → μ⟨ Σ ⟩≈ x → ∀ {y} → x ≈ y → μ⟨ Σ ⟩≈ y
𝓢 (xs ≈[ y≈z ]) = 𝓢 xs
𝒻 (xs ≈[ y≈z ]) = trans (𝒻 xs) y≈z
-- This makes it possible to "rewrite" the summary, given a proof of equivalence.
open Measures using (μ⟨_⟩≈_; 𝓢; 𝒻; _≈[_])
-- ┌──────────────────────────────────────────────────────────────────────────────┐
-- │ │
-- │ │
-- │ │
-- │ │
-- │ │
-- │ ____ _ _ _ _ _ │
-- │ | _ \ ___ | \ | | ___ | |_ __ _| |_(_) ___ _ __ │
-- │ | | | |/ _ \ | \| |/ _ \| __/ _` | __| |/ _ \| '_ \ │
-- │ | |_| | (_) | | |\ | (_) | || (_| | |_| | (_) | | | | │
-- │ |____/ \___/ |_| \_|\___/ \__\__,_|\__|_|\___/|_| |_| │
-- │ │
-- │ │
-- │ │
-- │ │
-- │ │
-- └──────────────────────────────────────────────────────────────────────────────┘
module ArgSyntax where
-- The wrapper on its own isn't enough to save us from hundreds of lines of proofs.
-- Once you do computation on its contents, you still need to join it up with its
-- original proof of equivalence. In other words, you'll need to drill into the
-- return type of a function, find the place you used the relevant type variable,
-- and apply the relevant proof from the type above. This can really clutter
-- proofs. Instead, we can use Agda's new support for do notation to try and get a
-- cleaner notation for everything. Here's a big block of code:
infixl 2 arg-syntax
record Arg {a} (Σ : Set a) ⦃ _ : σ Σ ⦄ (𝓂 : 𝓡) (f : 𝓡 → 𝓡) : Set (m ⊔ c ⊔ a) where
constructor arg-syntax
field
⟨f⟩ : Congruent₁ f
arg : μ⟨ Σ ⟩≈ 𝓂
open Arg
syntax arg-syntax (λ sz → e₁) xs = xs [ e₁ ⟿ sz ]
infixl 1 _>>=_
_>>=_ : ∀ {a b} {Σ₁ : Set a} {Σ₂ : Set b} ⦃ _ : σ Σ₁ ⦄ ⦃ _ : σ Σ₂ ⦄ {𝓂 f}
→ Arg Σ₁ 𝓂 f
→ ((x : Σ₁) → ⦃ x≈ : μ x ≈ 𝓂 ⦄ → μ⟨ Σ₂ ⟩≈ f (μ x))
→ μ⟨ Σ₂ ⟩≈ f 𝓂
arg-syntax cng xs >>= k = k (𝓢 xs) ⦃ 𝒻 xs ⦄ ≈[ cng (𝒻 xs) ]
-- First, we define a wrapper for types parameterised by their summary, with a way
-- to lift an underlying equality up into some expression f. The >>= operator
-- just connects up all of the relevant bits. An example is what's needed:
open Measures using (arg-syntax; _>>=_)
listToTree : ∀ {a} {Σ : Set a} ⦃ _ : σ Σ ⦄ → (xs : List Σ) → μ⟨ Tree Σ ⟩≈ μ xs
listToTree [] = empty ⇑
listToTree (x ∷ xs) = [ ℳ ↯ ]≈ do
ys ← listToTree xs [ μ x ∙> s ⟿ s ]
x ◂ ys
-- The first line is the base case, nothing interesting going on there. The second
-- line begins the do-notation, but first applies [ ℳ ↯ ]≈: this calls the
-- automated solver. The second line makes the recursive call, and with the syntax:
--
-- [ μ x ∙> s ⟿ s ]
--
-- It tells us where the size of the bound variable will end up in the outer
-- expression.
-- ┌──────────────────────────────────────────────────────────────────────────────┐
-- │ │
-- │ │
-- │ │
-- │ │
-- │ │
-- │ __ __ _ _ │
-- │ | \/ | ___ __| |_ _| | ___ ___ │
-- │ | |\/| |/ _ \ / _` | | | | |/ _ \/ __| │
-- │ | | | | (_) | (_| | |_| | | __/\__ \ │
-- │ |_| |_|\___/ \__,_|\__,_|_|\___||___/ │
-- │ │
-- │ │
-- │ │
-- │ │
-- │ │
-- └──────────────────────────────────────────────────────────────────────────────┘
-- Definition of Measures and so on
import Data.FingerTree.Measures
-- Tools for writing proofs
import Data.FingerTree.Reasoning
-- The finger tree type
import Data.FingerTree.Structures
-- Cons and Snoc
import Data.FingerTree.Cons
-- Uncons and Unsnoc
import Data.FingerTree.View
-- Split
import Data.FingerTree.Split
-- ┌──────────────────────────────────────────────────────────────────────────────┐
-- │ ___ __ │
-- │ | _ \___ / _|___ _ _ ___ _ _ __ ___ ___ │
-- │ | / -_) _/ -_) '_/ -_) ' \/ _/ -_|_-< │
-- │ |_|_\___|_| \___|_| \___|_||_\__\___/__/ │
-- │ │
-- └──────────────────────────────────────────────────────────────────────────────┘
-- [1]
-- Mu, Shin-Cheng, Yu-Hsi Chiang, and Yu-Han Lyu. 2016. “Queueing and Glueing for
-- Optimal Partitioning (Functional Pearl).” In Proceedings of the 21st ACM SIGPLAN
-- International Conference on Functional Programming, 158–167. ICFP 2016. New
-- York, NY, USA: ACM. doi:10.1145/2951913.2951923.
-- https://www.iis.sinica.edu.tw/~scm/pub/queueing-glueing.pdf.
-- [2]
-- Hinze, Ralf, and Ross Paterson. 2006. “Finger Trees: A Simple General-purpose
-- Data Structure.” Journal of Functional Programming 16 (2): 197–217.
-- http://www.staff.city.ac.uk/~ross/papers/FingerTree.html.
-- [3]
-- Hinze, Ralf. 2001. “A Simple Implementation Technique for Priority Search
-- Queues.” In Proceedings of the 2001 International Conference on Functional
-- Programming, 110–121. ACM Press. doi:10.1145/507635.507650.
-- https://www.cs.ox.ac.uk/people/ralf.hinze/publications/\#P14.
-- [4]
-- Sozeau, Matthieu. 2007. “Program-ing Finger Trees in Coq.” In Proceedings of the
-- 12th ACM SIGPLAN International Conference on Functional Programming, 13–24. ICFP
-- ’07. New York, NY, USA: ACM. doi:10.1145/1291151.1291156.
-- https://www.irif.fr/~sozeau/research/publications/Program-ing_Finger_Trees_in_Coq.pdf.
-- [5]
-- Danielsson, Nils Anders. 2008. “Lightweight Semiformal Time Complexity Analysis
-- for Purely Functional Data Structures.” In Proceedings of the 35th Annual ACM
-- SIGPLAN-SIGACT Symposium on Principles of Programming Languages, 133–144. POPL
-- ’08. New York, NY, USA: ACM. doi:10.1145/1328438.1328457.
-- http://www.cse.chalmers.se/~nad/publications/danielsson-popl2008.pdf.