I work at Apple on database systems for iCloud, including the FoundationDB Record Layer. Before joining Apple, I was a graduate student in computer science at MIT, where I worked on sketching problems in machine learning and was privileged to be advised by Piotr Indyk.
I am broadly interested in the design, analysis, and application of algorithms, especially those that involve randomness or apply to large data sets. Previously, I graduated from Caltech in 2016 with a degree in computer science.
My CV is available.
Christos Chrysafis, Ben Collins, Scott Dugas, Jay Dunkelberger, Moussa Ehsan, Scott Gray, Alec Grieser, Ori Herrnstadt, Kfir Lev-Ari, Tao Lin, Mike McMahon, Nicholas Schiefer, and Alexander Shraer, “FoundationDB Record Layer: A Multi-Tenant Structured Datastore”, to appear in 2019 International Conference on Management of Data, Industry Track (SIGMOD 2019), 2019.
The FoundationDB Record Layer is an open source library that provides a record-oriented data store with semantics similar to a relational database implemented on top of FoundationDB, an ordered, transactional key-value store. The Record Layer provides a lightweight, highly extensible way to store structured data. It offers schema management and a rich set of query and indexing facilities, some of which are not usually found in traditional relational databases, such as nested record types, indexes on commit versions, and indexes that span multiple record types. The Record Layer is stateless and built for massive multi-tenancy, encapsulating and isolating all of a tenant's state, including indexes, into a separate logical database. We demonstrate how the Record Layer is used by CloudKit, Apple's cloud backend service, to provide powerful abstractions to applications serving hundreds of millions of users. CloudKit uses the Record Layer to host billions of independent databases, many with a common schema. Features provided by the Record Layer enable CloudKit to provide richer APIs and stronger semantics with reduced maintenance overhead and improved scalability.
Peter Ahrens, Helen Xu, and Nicholas Schiefer, “A Fill Estimation Algorithm for Sparse Matrices and Tensors in Blocked Formats”, 2018 IEEE International Parallel and Distributed Processing Symposium (IPDPS 2018), 2018.
Many sparse matrices and tensors from a variety of applications, such as finite element methods and computational chemistry, have a natural aligned rectangular nonzero block structure. Researchers have designed high-performance blocked sparse operations which can take advantage of this sparsity structure to reduce the complexity of storing the locations of nonzeros. The performance of a blocked sparse operation depends on how well the block size reflects the structure of nonzeros in the tensor. Sparse tensor structure is generally unknown until runtime, so block size selection must be efficient. The fill is a quantity which, for some block size, relates the number of nonzero blocks to the number of nonzeros. Many performance models use the fill to help choose a block size. However, the fill is expensive to compute exactly. We present a sampling-based algorithm called Phil to estimate the fill of sparse matrices and tensors in any format. We provide theoretical guarantees for sparse matrices and tensors, and experimental results for matrices. The existing state-of-the-art fill estimation algorithm, which we will call OSKI, runs in time linear in the number of elements in the tensor. The number of samples Phil needs to compute a fill estimate is unrelated to the number of nonzeros and depends only on the order (number of dimensions) of the tensor, desired accuracy of the estimate, desired probability of achieving this accuracy, and number of considered block sizes. We compare Phil and OSKI on a suite of 42 matrices. On most inputs, Phil estimates the fill at least 2 times faster and often more than 20 times faster than OSKI. Phil consistently produced accurate estimates; in all cases that we tested Phil was faster and/or more accurate than OSKI. Finally, we find that Phil and OSKI produce comparable speedups in multicore blocked sparse matrix-vector multiplication (SpMV) when the block size was chosen using fill estimates in a model due to Vuduc et al.
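As a rough illustration of the quantity being estimated, the sketch below computes a fill exactly for a small matrix given as a list of nonzero coordinates. It assumes the OSKI-style normalization (entries stored in nonzero blocks, including explicit zeros, divided by the number of true nonzeros); the abstract does not fix a formula, and other normalizations exist. The function name `exact_fill` and its parameters are hypothetical. This is the expensive exact computation, not the sampling-based Phil algorithm.

```python
from typing import Iterable, Tuple


def exact_fill(nonzeros: Iterable[Tuple[int, int]],
               block_shape: Tuple[int, int]) -> float:
    """Exact fill of a sparse matrix for one aligned block shape.

    Assumption: fill = (nonzero blocks * entries per block) / (true nonzeros),
    i.e. the OSKI-style fill ratio. A value of 1.0 means no explicit zeros
    would be stored; larger values mean more padding.
    """
    r, c = block_shape
    coords = set(nonzeros)  # deduplicate repeated coordinates
    if not coords:
        raise ValueError("fill is undefined for a matrix with no nonzeros")
    # An aligned r-by-c block is identified by the integer-divided coordinates.
    blocks = {(i // r, j // c) for i, j in coords}
    return len(blocks) * r * c / len(coords)


# A dense 2x2 corner plus one isolated entry.
example = [(0, 0), (0, 1), (1, 0), (1, 1), (2, 2)]
fill_1x1 = exact_fill(example, (1, 1))  # every nonzero is its own block
fill_2x2 = exact_fill(example, (2, 2))  # two 2x2 blocks hold five nonzeros
```

Here the 2x2 blocking stores eight entries for five nonzeros, so its fill is higher than the 1x1 blocking; a performance model would weigh that padding against the savings in index storage.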
Nicholas Schiefer and Erik Winfree, “Time Complexity of Computation and Construction in the Chemical Reaction Network-Controlled Tile Assembly Model”, 22nd International Conference on DNA Computing and Molecular Programming (DNA22), 2016.
In isolation, chemical reaction networks and tile-based self-assembly are well-studied models of chemical computation. Previously, we introduced the chemical reaction network-controlled tile assembly model (CRN-TAM), in which a stochastic chemical reaction network can act as a non-local control and signalling system for tile-based assembly, and showed that the CRN-TAM can perform several tasks related to the simulation of Turing machines and construction of algorithmic shapes with lower space or program complexity than in either of its parent models. Here, we introduce a kinetic variant of the CRN-TAM and investigate the time complexity of computation and construction. We analyze the time complexity of decision problems in the CRN-TAM, and show that decidable languages can be decided as efficiently by CRN-TAM programs as by Turing machines. We also give a lower bound for the space-time complexity of CRN-TAM computation that rules out efficient parallel stack machines. We provide efficient parallel implementations of non-deterministic computations, showing among other things that CRN-TAM programs can decide languages in NTIME ∩ coNTIME(f(n)) in O(f(n) + n + log c) time with (1 - exp (-c)) probability, using volume exponential in n. Lastly, we provide basic mechanisms for parallel computations that share information and illustrate the limits of parallel computation in the CRN-TAM.
Nicholas Schiefer and Erik Winfree, “Universal Computation and Optimal Construction in the Chemical Reaction Network-Controlled Tile Assembly Model”, 21st International Conference on DNA Computing and Molecular Programming (DNA21), 2015.
Tile-based self-assembly and chemical reaction networks provide two well-studied models of scalable DNA-based computation. Although tile self-assembly provides a powerful framework for describing Turing-universal self-assembling systems, assembly logic in tile self-assembly is localized, so that only the nearby environment can affect the process of self-assembly. We introduce a new model of tile-based self-assembly in which a well-mixed chemical reaction network interacts with self-assembling tiles to exert non-local control on the self-assembly process. Through simulation of multi-stack machines, we demonstrate that this new model is efficiently Turing-universal, even when restricted to unbounded space in only one spatial dimension. Using a natural notion of program complexity, we also show that this new model can produce many complex shapes with programs of lower complexity. Most notably, we show that arbitrary connected shapes can be produced by a program with complexity bounded by the Kolmogorov complexity of the shape, without the large scale factor that is required for the analogous result in the abstract tile assembly model. These results suggest that controlled self-assembly provides additional algorithmic power over tile-only self-assembly, and that non-local control enhances our ability to perform computation and algorithmically self-assemble structures from small input programs.