I'm a senior at Brown University pursuing an Sc.B in Computer Science. My academic interests lie in applied cryptography, decentralized networks, data science and computer graphics. In my spare time, I enjoy traveling, exploring what lies beyond horizons, and practicing Ghanaian drumming and dancing.
Increasingly, studies across the scientific community are uncovered as being false discoveries. This phenomena is largely due to
“p-hacking” — the process of repeatedly testing hypotheses on a dataset until an interesting
result emerges. Since hypothesis testing inherently involves a small failure probability, given enough tests performed
on a single data set, a spurious correlation will emerge. This leads to obviously absured and unreplicatable claims being published in dubious news outlets (e.g., “A new study shows that drinking a glass of wine is just as good as spending an hour at the gym”) with the claim “p-value < 0.05!” but also affects reputable institutions and journals where more subtle forms of data dredging are present as well.
Accurately controlling for such errors is notoriously difficult since there are no existing means of tracking researcher bias during the testing procedure which leave many studies up to good faith. With the massive amounts of data available to researchers today, p-hacking is becoming a serious concern, with some journals outright refusing to consider studies involving p-values prior to having them reproduced by independent researchers (a costly and tedious endeavour).
My research has led me to develop a system called Custodes which provably certifies valid hypothesis testing procedures using cryptography. The system uses a novel approach of combining homomorphic encryption, secure multi-party computation, and distributed-ledgers to enforce the validity of statistical tests performed on data and as a result eliminates the possiblity for p-hacking to occur.
Data visualization tools such as Vizdom are highly interactive and require that results be displayed to users without compromising interaction. Unfortunately, numerous data mining algorithms are slow and require several seconds to minutes prior to producing a useful result to a given query. This leaves visualization tools with an ultimatum: either break all user interaction by running the algorithms on-the-spot or pre-compute all possible queries ahead of the exploration task (an unreasonable assumption in many cases).
My research involved developing "progressive" data mining algorithms which output many incremental and useful results, quickly converging on an exact output, while simultaneously providing strong error guarantees on each output. With the progressive algorithm, rather than waiting minutes for a result of a data mining query to be displayed, a high-quality, converging, approximation to the exact output is displayed every few milliseconds thus enabling true interaction between the data exploration task and the user.
BGN Homomorphic Encryption Scheme
Threshold-Paillier Homomorphic Encryption Scheme with ZKPs
An extended implementation of the Threshold-Paillier scheme with some other nice properties and proofs.
The scheme is based on the paper construction described in: Multiparty Computation from Threshold Homomorphic Encryption. The original repository can be found here.
Cryptographically Certified Hypothesis Testing,
Senior Honors Thesis, Brown University, 2018.
ProSecCo: Progressive Sequence Mining with Convergence Guarantees,
S. Servan-Schreiber, M. Riondato, and E. Zgraggen.
IEEE ICDM'18 2018
Best Student Paper runner-up Award
Towards Quantifying Uncertainty in Data Analysis & Exploration,
Y Chung, S Servan-Schreiber, E Zgraggen, T Kraska.
IEEE Data Engineering Bulletin, 41 (3), 2018
Links from around the web:
“Have nothing in your house that you do not know to be useful or believe to be beautiful.”— William Morris, Hopes and Fears for Art, 1882.
“Art is an outlet toward regions which are not ruled by time and space.” — Marcel Duchamp.
“Let the future tell the truth, and evaluate each one according to his work and accomplishments. The present is theirs; the future, for which I have really worked, is mine.” — Nikola Tesla.
“The man who has fed the chicken every day throughout its life at last wrings its neck instead, showing that more refined views as to the uniformity of nature would have been useful to the chicken.” — Bertrand Russell, The Problems of Philosophy, 1912.