Hi! My name is Sacha. I recently graduated from Brown University in December '18 with an Sc.B. in Computer Science. My academic and research interests are in applied cryptography, decentralized networks, data science, and computer graphics. In my spare time, I enjoy traveling, exploring what lies beyond horizons, and practicing Ghanaian drumming and dancing.

My Erdős number is 4 (Erdős → Suen → Upfal → Riondato → Sacha).

GitHub is sachaservan.


Certifying Scientific Studies with Cryptography

Senior thesis: Cryptographically Certified Hypothesis Testing.
Technical report: Custodes: Auditable Hypothesis Testing.

Increasingly, studies across the scientific community are being exposed as false discoveries. This phenomenon is largely due to “p-hacking”: the process of repeatedly testing hypotheses on a dataset until an interesting result emerges. Since every hypothesis test carries a small probability of a false positive, running enough tests on a single dataset is almost certain to produce a spurious correlation. This leads to absurd and irreproducible claims being published in dubious news outlets (e.g., “A new study shows that drinking a glass of wine is just as good as spending an hour at the gym”) under the banner “p-value < 0.05!”, but it also affects reputable institutions and journals, where more subtle forms of data dredging are present.
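The effect of repeated testing can be made concrete with a little arithmetic. The sketch below is only an illustration and is not part of Custodes: it assumes the tests are independent and that every null hypothesis is true, in which case the chance of at least one false positive among m tests at significance level α is 1 − (1 − α)^m. The function names are hypothetical, and the Bonferroni correction is shown only as one standard, conservative remedy.

```python
def family_wise_error_rate(num_tests: int, alpha: float = 0.05) -> float:
    """Probability of at least one false positive among `num_tests`
    independent tests when every null hypothesis is actually true."""
    return 1.0 - (1.0 - alpha) ** num_tests


def bonferroni_threshold(num_tests: int, alpha: float = 0.05) -> float:
    """Per-test significance level that keeps the family-wise error
    rate at or below `alpha` (Bonferroni correction)."""
    return alpha / num_tests


if __name__ == "__main__":
    for m in (1, 10, 50, 100):
        uncorrected = family_wise_error_rate(m)
        corrected = family_wise_error_rate(m, bonferroni_threshold(m))
        print(f"{m:>3} tests: uncorrected={uncorrected:.3f}, Bonferroni={corrected:.3f}")
```

Under these assumptions, ten tests at α = 0.05 already give roughly a 40% chance of at least one spurious "discovery", which is why the number of tests performed has to be tracked somewhere.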

Accurately controlling for such errors is notoriously difficult: there is no existing means of tracking researcher bias during the testing procedure, which leaves the validity of many studies up to good faith. With the massive amounts of data available to researchers today, p-hacking is becoming a serious concern, and some journals outright refuse to consider studies involving p-values until they have been reproduced by independent researchers (a costly and tedious endeavor).

My research has led me to develop a system called Custodes, which provably certifies valid hypothesis testing procedures using cryptography. The system takes a novel approach, combining homomorphic encryption, secure multi-party computation, and distributed ledgers to enforce the validity of statistical tests performed on data, and as a result eliminates the possibility of p-hacking.

Progressive Sequence Mining Algorithms

Conference paper: ProSecCo: Progressive Sequence Mining with Convergence Guarantees.

Data visualization tools such as Vizdom are highly interactive and require that results be displayed to users without compromising interaction. Unfortunately, many data mining algorithms are slow and take several seconds to minutes to produce a useful result for a given query. This leaves visualization tools with a dilemma: either break user interaction by running the algorithms on the spot, or pre-compute all possible queries ahead of the exploration task (an unreasonable assumption in many cases).

My research involved developing "progressive" data mining algorithms, which output many useful incremental results that quickly converge to the exact output, while providing strong error guarantees on each intermediate result. With a progressive algorithm, rather than waiting minutes for the result of a data mining query, the user sees a high-quality, converging approximation of the exact output every few milliseconds, enabling true interaction between the user and the data exploration task.
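The progressive idea can be sketched on a much simpler problem than sequence mining. The code below is not ProSecCo and does not use its error analysis. It is a minimal illustration, with hypothetical names, that estimates the frequency of a single item by processing a dataset block by block. It assumes the transactions are in uniformly random order, so the Hoeffding inequality applies to each intermediate estimate; each bound holds with probability at least 1 − δ on its own, not jointly across all blocks.

```python
import math
from typing import Iterator, List, Set, Tuple


def progressive_frequency(
    transactions: List[Set[str]],
    item: str,
    block_size: int = 1000,
    delta: float = 0.05,
) -> Iterator[Tuple[float, float]]:
    """Yield (estimate, error_bound) after each processed block.

    Assumes `transactions` is in uniformly random order. The last
    yielded pair is the exact frequency with an error bound of 0.
    """
    total = len(transactions)
    count = 0
    seen = 0
    for start in range(0, total, block_size):
        block = transactions[start:start + block_size]
        count += sum(1 for t in block if item in t)
        seen += len(block)
        estimate = count / seen
        if seen == total:
            # Every transaction has been read, so the answer is exact.
            yield estimate, 0.0
        else:
            # Hoeffding bound for the mean of `seen` values in [0, 1].
            epsilon = math.sqrt(math.log(2.0 / delta) / (2.0 * seen))
            yield estimate, epsilon


if __name__ == "__main__":
    import random

    random.seed(0)
    data = [{"a", "b"} if i % 4 == 0 else {"b"} for i in range(10_000)]
    random.shuffle(data)  # enforce the random-order assumption
    for estimate, bound in progressive_frequency(data, "a"):
        print(f"frequency of 'a' ~ {estimate:.4f} +/- {bound:.4f}")
```

A visualization front end could redraw after every yielded pair, showing an estimate whose error bar shrinks until the exact value arrives.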


BGN Homomorphic Encryption Scheme

My implementation of the BGN homomorphic encryption scheme. Written in Go. The implementation is based on the construction described in: Evaluating 2-DNF Formulas on Ciphertexts.

Threshold-Paillier Homomorphic Encryption Scheme with ZKPs

An extended implementation of the Threshold-Paillier scheme with some other nice properties and proofs. Written in Go. The scheme is based on the construction described in: Multiparty Computation from Threshold Homomorphic Encryption. The original repository can be found here.
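For readers unfamiliar with the underlying primitive, here is a toy sketch of the additive homomorphism in plain (non-threshold) Paillier. It is not the code or the API of the Go repository above: there is no threshold decryption and there are no zero-knowledge proofs, the function names are hypothetical, and the tiny hard-coded primes are insecure and chosen only so the example runs instantly. It uses the common simplification g = n + 1.

```python
import math
import secrets
from typing import Tuple


def keygen(p: int, q: int) -> Tuple[int, Tuple[int, int]]:
    """Return (public modulus n, secret key (lam, mu)) for primes p, q."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)  # modular inverse (Python 3.8+), valid since g = n + 1
    return n, (lam, mu)


def encrypt(n: int, message: int) -> int:
    """Encrypt 0 <= message < n as (1+n)^m * r^n mod n^2 for random r."""
    n_sq = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, message, n_sq) * pow(r, n, n_sq)) % n_sq


def decrypt(n: int, secret_key: Tuple[int, int], ciphertext: int) -> int:
    """Recover the plaintext using L(u) = (u - 1) / n."""
    lam, mu = secret_key
    u = pow(ciphertext, lam, n * n)
    return ((u - 1) // n * mu) % n


def add(n: int, c1: int, c2: int) -> int:
    """Multiplying ciphertexts adds the underlying plaintexts (mod n)."""
    return (c1 * c2) % (n * n)


def scalar_mul(n: int, c: int, k: int) -> int:
    """Raising a ciphertext to the power k multiplies the plaintext by k (mod n)."""
    return pow(c, k, n * n)


if __name__ == "__main__":
    n, sk = keygen(1009, 1013)  # toy primes, NOT secure
    c1, c2 = encrypt(n, 20), encrypt(n, 22)
    print(decrypt(n, sk, add(n, c1, c2)))
    print(decrypt(n, sk, scalar_mul(n, c1, 3)))
```

In a threshold variant, the secret key is shared among several parties so that no single party can decrypt alone; that part is deliberately left out of this sketch.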

ProSecCo Sequence Mining Algorithm

An implementation of the ProSecCo sequence mining algorithm, which is described in my recent paper.

The C# version of the algorithm is on GitHub.

The Java version of the algorithm can be found in the SPMF data mining library.


Conference Papers
  1. ProSecCo: Progressive Sequence Mining with Convergence Guarantees,
    S. Servan-Schreiber, M. Riondato, and E. Zgraggen.
    IEEE ICDM'18 2018
    Best Student Paper runner-up Award

  2. Towards Quantifying Uncertainty in Data Analysis & Exploration,
    Y. Chung, S. Servan-Schreiber, E. Zgraggen, and T. Kraska.
    IEEE Data Engineering Bulletin, 41 (3), 2018

Technical Reports
  1. Custodes: Auditable Hypothesis Testing,
    S. Servan-Schreiber, O. Ohrimenko, T. Kraska, and E. Zgraggen.

  2. ProSecCo: Progressive Sequence Mining with Convergence Guarantees,
    S. Servan-Schreiber, M. Riondato, and E. Zgraggen.
    (Journal version)

Undergraduate Senior Thesis
  1. Cryptographically Certified Hypothesis Testing,
    S. Servan-Schreiber.
    Senior Honors Thesis, Brown University, 2018.

Other Projects


Earlier this year, I co-founded Meditect, a startup based in France that aims to curb the sale of counterfeit medication in France and in countries importing French pharmaceuticals (primarily francophone countries in Africa). France is introducing a new regulation this year that requires an individual serial number on every box of medication produced. Meditect takes advantage of this regulation by positioning itself as a certificate authority, tracking exports along the supply chain and fighting counterfeit medication markets.
We are working with several manufacturers and suppliers in France to help track the supply chain and provide a system for consumers to verify the authenticity of their prescriptions.


