Basic Knowledge
NoSQL
Categories
- Key-value: AWS DynamoDB, Azure Table Storage, Google Cloud Datastore, Riak, Redis
- Column-family/BigTable: Cassandra, HBase, Hypertable
- Document: MongoDB, CouchDB, AWS DynamoDB
- Graph: Neo4j
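To make the four categories concrete, here is a minimal sketch that models one made-up user record in each style using only Python built-ins. The record, the field names, and the helper functions are hypothetical; none of this is the API of any product listed above.

```python
# One hypothetical "user" record expressed in each of the four NoSQL data models.
# All names and values are invented for illustration.

# Key-value: an opaque value looked up by a single key.
key_value_store = {"user:42": '{"name": "Ada", "city": "London"}'}

# Column-family: row key -> column family -> column -> value.
column_family_store = {
    "user:42": {
        "profile": {"name": "Ada", "city": "London"},
        "activity": {"last_login": "2024-01-01"},
    }
}

# Document: nested, self-describing documents whose fields can be queried.
document_store = [
    {"_id": 42, "name": "Ada", "address": {"city": "London"}, "tags": ["admin"]},
]

# Graph: nodes plus labeled edges between them.
graph_nodes = {42: {"name": "Ada"}, 7: {"name": "Grace"}}
graph_edges = [(42, "FOLLOWS", 7)]


def find_by_city(city):
    """Document-style query on a nested field."""
    return [doc["_id"] for doc in document_store if doc["address"]["city"] == city]


def followers_of(node_id):
    """Graph-style query: ids of nodes with a FOLLOWS edge pointing at node_id."""
    return [src for (src, label, dst) in graph_edges
            if label == "FOLLOWS" and dst == node_id]
```

The key-value store can only fetch the whole value by key, while the document and graph sketches can answer questions about fields and relationships. That difference in query shape is the main reason to pick one category over another.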
Redux
Redux is an open-source JavaScript library for managing application state. It is most commonly used with libraries such as React or Angular for building user interfaces.
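Redux itself is a JavaScript library. The Python sketch below only mirrors its core idea: a single store whose state changes only when an action is dispatched to a pure reducer function. The Store class, its method names, and the counter example are hypothetical and are not the Redux API.

```python
def counter_reducer(state, action):
    """Pure function: (state, action) -> new state. It never mutates its input."""
    if action["type"] == "increment":
        return {**state, "count": state["count"] + 1}
    if action["type"] == "add":
        return {**state, "count": state["count"] + action["payload"]}
    return state  # unknown actions leave the state unchanged


class Store:
    """Hypothetical minimal store: read state, dispatch actions, notify listeners."""

    def __init__(self, reducer, initial_state):
        self._reducer = reducer
        self._state = initial_state
        self._listeners = []

    def get_state(self):
        return self._state

    def subscribe(self, listener):
        self._listeners.append(listener)

    def dispatch(self, action):
        # The reducer is the only place where a new state is computed.
        self._state = self._reducer(self._state, action)
        for listener in self._listeners:
            listener(self._state)


store = Store(counter_reducer, {"count": 0})
seen = []  # stands in for a UI that re-renders on every state change
store.subscribe(lambda state: seen.append(state["count"]))
store.dispatch({"type": "increment"})
store.dispatch({"type": "add", "payload": 5})
```

Because every change goes through dispatch, the history of actions fully determines the current state, which is what makes this pattern easy to debug and test.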
Spark
Apache Spark is an open-source cluster-computing framework. Its main advantage over Hadoop MapReduce, a disk-bound batch engine, is that it can keep intermediate data in memory, which makes iterative and interactive workloads much faster and supports near-real-time stream processing.
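For reference, the MapReduce model that Spark is compared against can be sketched in plain Python as three phases. This is a single-process illustration with hypothetical function names, not the Hadoop or Spark API. In Hadoop MapReduce the data passed between these phases goes through disk, whereas Spark can hold intermediate results in memory.

```python
from collections import defaultdict


def map_phase(lines):
    """Emit a (word, 1) pair for every word in every input line."""
    for line in lines:
        for word in line.lower().split():
            yield (word, 1)


def shuffle_phase(pairs):
    """Group values by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Sum the grouped counts for each word."""
    return {word: sum(counts) for word, counts in groups.items()}


def word_count(lines):
    """Run the three phases in order on an in-memory list of lines."""
    return reduce_phase(shuffle_phase(map_phase(lines)))
```

Calling word_count(["to be or not to be"]) counts "to" and "be" twice and "or" and "not" once.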
Hadoop
HDFS and YARN form the data management layer of Apache Hadoop. YARN, the resource management framework at the architectural center of Hadoop, lets batch, interactive, and real-time workloads run simultaneously on one shared dataset. HDFS provides the scalable, fault-tolerant, cost-efficient storage underneath.
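HDFS gets its fault tolerance by splitting files into blocks and storing several copies of each block on different nodes. The rough arithmetic can be sketched as follows, assuming a 128 MB block size and a replication factor of 3. Both are common defaults but are configurable per cluster, and the function name is hypothetical.

```python
import math


def hdfs_footprint(file_size_mb, block_size_mb=128, replication=3):
    """Return (blocks, block replicas, raw storage in MB) for one file.

    Assumes the last block only occupies its actual size on disk, so raw
    storage is simply the file size times the replication factor.
    """
    blocks = math.ceil(file_size_mb / block_size_mb)
    replicas = blocks * replication
    raw_storage_mb = file_size_mb * replication
    return blocks, replicas, raw_storage_mb
```

Under these assumptions a 1000 MB file is split into 8 blocks, stored as 24 block replicas, and consumes 3000 MB of raw cluster storage.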
CAP theorem
A distributed data store can provide at most two of the following three guarantees at the same time:
- Consistency: Every read receives the most recent write or an error.
- Availability: Every request receives a (non-error) response, without a guarantee that it contains the most recent write.
- Partition tolerance: The system continues to operate despite an arbitrary number of messages being dropped (or delayed) by the network between nodes.
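In practice, when the network partitions, a system has to give up either consistency or availability. The following toy sketch illustrates that choice with two replicas of a single value. The TwoNodeStore class and its "CP" and "AP" modes are hypothetical simplifications, not how any real database implements replication.

```python
class Replica:
    """One copy of a single stored value."""

    def __init__(self):
        self.value = None
        self.version = 0


class TwoNodeStore:
    """Hypothetical two-replica register used only to illustrate the trade-off."""

    def __init__(self, mode):
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.replicas = [Replica(), Replica()]
        self.partitioned = False  # True: the replicas cannot reach each other

    def write(self, node, value):
        if self.partitioned and self.mode == "CP":
            # Consistency first: refuse rather than let the replicas diverge.
            raise RuntimeError("unavailable: cannot reach the other replica")
        # During a partition an AP store updates only the replica it can reach.
        targets = [self.replicas[node]] if self.partitioned else self.replicas
        new_version = self.replicas[node].version + 1
        for replica in targets:
            replica.value, replica.version = value, new_version

    def read(self, node):
        if self.partitioned and self.mode == "CP":
            # Consistency first: an error instead of a possibly stale value.
            raise RuntimeError("unavailable: cannot confirm the latest write")
        # Availability first: answer from the local copy, which may be stale.
        return self.replicas[node].value


ap = TwoNodeStore("AP")
ap.write(0, "v1")      # replicated to both nodes
ap.partitioned = True
ap.write(0, "v2")      # only node 0 sees this write
stale = ap.read(1)     # node 1 still answers, with the older value "v1"
```

The same sequence against TwoNodeStore("CP") raises RuntimeError on the write and the read during the partition: the store stays consistent by becoming unavailable.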