Non-relational data and NoSQL

Relational Databases
Relational DBs replaced file-based data processing and offered huge improvements
Invented by E. F. Codd in the 1970s
Very rigid two-dimensional design for data organization
Based on extensive utilization of SQL
Only works with structured data
Great for management of transactional records

Departing from traditional design
Growth of Internet
Numerous applications
Hugely increased demand for information
Variety of types of data
Change in data storage
Processing speeds
Volume of data
Web3 ideas

https://www.freecodecamp.org/news/what-is-web3/

Storage Capacity Terms

Business Intelligence Systems
Business intelligence (BI) systems are information systems that:
assist managers and other professionals in the analysis of current and past activities and in the prediction of future events
do not support operational activities, such as the recording and processing of orders
these are supported by transaction processing systems
support management assessment, analysis, planning and control
BI systems fall into two broad categories:
reporting systems that sort, filter, group, and make elementary calculations on operational data
data mining applications that perform sophisticated analyses on data; analyses that usually involve complex statistical and mathematical processing
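As a rough illustration of what a reporting system does (sort, filter, group, and make elementary calculations), here is a minimal Python sketch. The `orders` records, the field names, and the `report_by_region` helper are all hypothetical and stand in for real operational data.

```python
from collections import defaultdict

# Hypothetical operational data: one dict per recorded order.
orders = [
    {"region": "East", "product": "A", "amount": 120.0},
    {"region": "West", "product": "B", "amount": 80.0},
    {"region": "East", "product": "B", "amount": 200.0},
    {"region": "West", "product": "A", "amount": 50.0},
]

def report_by_region(rows, min_amount=0.0):
    """Filter orders by amount, group them by region, total each group,
    and return the groups sorted by total (largest first)."""
    totals = defaultdict(float)
    for row in rows:
        if row["amount"] >= min_amount:              # filter
            totals[row["region"]] += row["amount"]   # group and sum
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)

report = report_by_region(orders, min_amount=60.0)
print(report)
```

A data mining application would go further than this, for example by fitting a statistical model to the same records.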

The Relationship Among Operational and BI Applications

Components of a Data Warehouse

Data preparation
Problems with Operational Data
“Dirty data,” examples include:
“G” for gender, “213” for age
Missing values, inconsistent data
Nonintegrated data (data from multiple sources)
Incorrect format (ex: too many or not enough digits)
Too much data (ex: an excess number of columns)
Data may need to be transformed for use in a data warehouse.
{CountryCode → CountryName}
“US” → “United States”
Email address → Email domain
(any address at “somewhere.com”) → “somewhere.com”
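The checks and transformations listed above can be sketched in a few lines of Python. This is only an illustration: the field names, the plausible age range of 0 to 120, and the sample address are assumptions, not part of the original slides.

```python
# Lookup table for the CountryCode -> CountryName transformation.
COUNTRY_NAMES = {"US": "United States"}

def email_domain(address):
    """Return the part after '@', or None if the address is malformed."""
    local, sep, domain = address.rpartition("@")
    return domain if sep and local and domain else None

def prepare(record):
    """Return (cleaned_record, problems) for one operational record."""
    problems = []

    age = record.get("age")
    if age is None:
        problems.append("missing age")
    elif not 0 <= age <= 120:            # assumed plausible range
        problems.append(f"implausible age: {age}")
        age = None

    code = record.get("country_code")
    country = COUNTRY_NAMES.get(code)
    if country is None:
        problems.append(f"unknown country code: {code!r}")

    domain = email_domain(record.get("email") or "")
    if domain is None:
        problems.append("malformed email")

    cleaned = {"age": age, "country_name": country, "email_domain": domain}
    return cleaned, problems

# Hypothetical dirty record: the age of 213 echoes the slide's example.
cleaned, problems = prepare(
    {"age": 213, "country_code": "US", "email": "someone@somewhere.com"}
)
print(cleaned)
print(problems)
```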

Adding dimension to the data

Aggregated datasets

Multidimensional models

The term NoSQL was first used in 1998 by Carlo Strozzi as the name of his lightweight, open-source “relational” database that did not use SQL.
Created as a response to the need to process semi-structured, unstructured, and other kinds of data
Departed from the two-dimensional view of data

Can process
Structured
Unstructured
Unstructured Big Data
Utilized by companies such as Google, Twitter, LinkedIn, Facebook, etc.

Capabilities and Advantages
Can be purpose-built to specific data models
“Tableless” and opaque data storage
Can manage unstructured or multi-structured data
No need for a predefined schema
Better manage abstract data
Support graph data modeling
Support document-oriented data store
Less strict consistency (e.g. eventual consistency) models
Better operational performance
Require fewer computing resources
More horizontal and vertical scalability

CAP Theorem (Brewer’s theorem)
Applies to distributed data stores, which includes many non-relational databases
A distributed data store cannot provide all three of the following guarantees simultaneously:
Consistency
Availability
Partition Tolerance

CAP Principle
Consistency: The data within the database remains consistent, even after an operation has been executed. For instance, after an update, all clients will see the same data.
Availability: The system is constantly on (always available), with no downtime.
Partition Tolerance: The system continues to function even if communication among the servers is no longer reliable, that is, even when the servers are split into multiple groups that cannot communicate with each other.
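The trade-off can be illustrated with a toy two-replica cluster in Python. This is a sketch under simplifying assumptions (the `TinyCluster` class and its "CP"/"AP" modes are hypothetical, and no real database works this simply): during a partition the cluster either refuses writes to stay consistent or accepts them and lets the replicas diverge.

```python
class TinyCluster:
    """Two in-memory replicas; `mode` picks what to give up in a partition."""

    def __init__(self, mode):
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.replicas = {"a": {}, "b": {}}
        self.partitioned = False

    def write(self, replica_name, key, value):
        """Return True if the write was accepted."""
        if not self.partitioned:
            # Healthy network: every replica receives the write.
            for data in self.replicas.values():
                data[key] = value
            return True
        if self.mode == "CP":
            # Stay consistent by refusing the write (availability is lost).
            return False
        # AP: stay available by accepting locally (replicas now differ).
        self.replicas[replica_name][key] = value
        return True

    def consistent(self):
        return self.replicas["a"] == self.replicas["b"]

results = {}
for mode in ("CP", "AP"):
    cluster = TinyCluster(mode)
    cluster.write("a", "x", 1)
    cluster.partitioned = True            # the network splits
    accepted = cluster.write("a", "x", 2)
    results[mode] = (accepted, cluster.consistent())

print(results)
```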

ACID and BASE Provide Consistency
ACID = Atomicity + Consistency + Isolation + Durability
This concept is used with non-relational DBs as well!
The BASE (basically available, soft state, eventually consistent) approach is used for aggregate data stores and is a less rigid alternative to ACID
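To make the "A" in ACID concrete, here is a small sketch using Python's built-in `sqlite3` module. The `account` table, the names, and the `transfer` helper are hypothetical. A transfer that would overdraw an account fails part-way through, and the whole transaction is rolled back, so no partial update survives.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    " name TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.execute("INSERT INTO account VALUES ('alice', 100), ('bob', 0)")
conn.commit()

def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst atomically; return True on success."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False  # the CHECK constraint rejected a negative balance

def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM account"))

overdraft_ok = transfer(conn, "alice", "bob", 150)   # credit applied, debit fails
after_failure = balances(conn)
normal_ok = transfer(conn, "alice", "bob", 40)
after_success = balances(conn)
print(overdraft_ok, after_failure, normal_ok, after_success)
```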

BASE design
Accepts a certain rate of failure across the partitioned databases
Data is decomposed into functional groups
Supports a much higher volume of transactions
Allows for a decentralized DB approach
Scalability and cost efficiency benefits
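Eventual consistency, the "E" in BASE, can be sketched with a toy replicated store in Python. The `EventuallyConsistentStore` class is hypothetical and deliberately ignores conflict resolution: a write is acknowledged by one replica immediately and reaches the others only when `sync()` runs, so reads elsewhere can be stale in the meantime.

```python
from collections import deque

class EventuallyConsistentStore:
    def __init__(self, n_replicas=3):
        self.replicas = [{} for _ in range(n_replicas)]
        self.pending = deque()  # writes not yet propagated to other replicas

    def write(self, replica_id, key, value):
        """Acknowledge the write locally and queue it for replication."""
        self.replicas[replica_id][key] = value
        self.pending.append((replica_id, key, value))

    def read(self, replica_id, key):
        """Read from one replica; the result may be stale."""
        return self.replicas[replica_id].get(key)

    def sync(self):
        """Propagate queued writes, in order, to every other replica."""
        while self.pending:
            origin, key, value = self.pending.popleft()
            for i, replica in enumerate(self.replicas):
                if i != origin:
                    replica[key] = value

store = EventuallyConsistentStore()
store.write(0, "cart", ["book"])
stale = store.read(1, "cart")    # replica 1 has not seen the write yet
store.sync()
fresh = store.read(1, "cart")    # now every replica agrees
print(stale, fresh)
```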

Non-relational data storage
Schema-free and non-relational
Allows rapid changes and replication
Horizontally scalable
NoSQL uses data stores optimized for specific purposes
Four storage categories
Key-Value storage
Document storage
Wide Column storage
Graph database

Key-Value stores
Each key is associated with only one value in a collection
Dictionary of key-value pairs
Values can be of many different data types
The simplest type of NoSQL database
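A key-value store behaves much like a dictionary. The following Python sketch (the `KeyValueStore` class and the sample keys are hypothetical) shows the three basic operations and that values of different types can sit side by side.

```python
class KeyValueStore:
    """In-memory dictionary of key-value pairs; each key maps to one value."""

    def __init__(self):
        self._items = {}

    def put(self, key, value):
        self._items[key] = value      # overwrites any previous value

    def get(self, key, default=None):
        return self._items.get(key, default)

    def delete(self, key):
        """Remove the key; return True if it was present."""
        if key in self._items:
            del self._items[key]
            return True
        return False

store = KeyValueStore()
store.put("session:42", {"user": "ada", "cart": ["book"]})  # nested value
store.put("visits", 17)                                      # plain number
store.put("visits", 18)                                      # same key: replaced
print(store.get("visits"), store.get("missing", "n/a"))
```

The store never inspects the value, which is why the slide on capabilities calls this kind of storage opaque.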

Document-oriented stores
Focus is on storing document-oriented information (semi-structured data)
Pair each key with a document, a structured value that can contain its own fields and nested data
Do not require the data to be split across tables
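A document store can be approximated in Python with a dictionary of JSON documents. In this sketch the `insert` and `find` helpers, the document ids, and the sample fields are all hypothetical. Note that the two documents have different fields and that nested data stays inside the document rather than being split across tables.

```python
import json

collection = {}  # document id -> document

def insert(doc_id, document):
    """Store a copy of the document, serialized through JSON."""
    collection[doc_id] = json.loads(json.dumps(document))

def find(**criteria):
    """Return ids of documents whose top-level fields match all criteria."""
    return [
        doc_id
        for doc_id, doc in collection.items()
        if all(doc.get(field) == value for field, value in criteria.items())
    ]

insert("u1", {"name": "Ada", "role": "admin",
              "address": {"city": "London"}, "tags": ["math", "engines"]})
insert("u2", {"name": "Lin", "role": "admin", "phone": "555-0100"})

admins = find(role="admin")
ada = find(name="Ada", role="admin")
print(admins, ada, collection["u1"]["address"]["city"])
```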

Wide-Column Stores
Use a table with a row-and-column approach
Column names and data formats can vary between rows in the same table
Group columns into families (hence the name Column Family DBs)
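The row, column family, and column layout can be sketched as nested dictionaries in Python. The `WideColumnTable` class, the family names, and the sample rows are hypothetical. Families are fixed when the table is created, while the columns inside a family can differ from row to row.

```python
class WideColumnTable:
    def __init__(self, families):
        self.families = set(families)
        self.rows = {}  # row key -> family -> column -> value

    def put(self, row_key, family, column, value):
        if family not in self.families:
            raise KeyError(f"unknown column family: {family}")
        row = self.rows.setdefault(row_key, {})
        row.setdefault(family, {})[column] = value

    def get_family(self, row_key, family):
        """Return a copy of one column family for one row."""
        return dict(self.rows.get(row_key, {}).get(family, {}))

users = WideColumnTable(families=["profile", "activity"])
users.put("user1", "profile", "name", "Ada")
users.put("user1", "profile", "email", "ada@example.com")
users.put("user2", "profile", "name", "Lin")          # no email column here
users.put("user2", "activity", "last_login", "2024-01-01")

print(users.get_family("user1", "profile"))
print(users.get_family("user2", "profile"))
```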

Graph Stores
Uses graph-based structures
Uses nodes, edges, and properties to organize and store data
Each node can represent an entity (object) and is connected by one or more edges to other nodes to form relationships
Each node has a unique identifier
Each edge also has a unique identifier
Allows a network of connections to be established
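The node, edge, and property model can be sketched in Python as follows. The `Graph` class, the identifier format, and the sample labels are hypothetical and are not the API of any particular graph database.

```python
import itertools

class Graph:
    def __init__(self):
        self.nodes = {}  # node id -> properties
        self.edges = {}  # edge id -> (source id, target id, label, properties)
        self._counter = itertools.count(1)

    def add_node(self, **properties):
        node_id = f"n{next(self._counter)}"      # unique node identifier
        self.nodes[node_id] = properties
        return node_id

    def add_edge(self, source, target, label, **properties):
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both endpoints must be existing nodes")
        edge_id = f"e{next(self._counter)}"      # unique edge identifier
        self.edges[edge_id] = (source, target, label, properties)
        return edge_id

    def neighbors(self, node_id, label=None):
        """Targets of outgoing edges, optionally restricted to one label."""
        return [
            target
            for source, target, edge_label, _ in self.edges.values()
            if source == node_id and (label is None or edge_label == label)
        ]

g = Graph()
ada = g.add_node(name="Ada")
lin = g.add_node(name="Lin")
acme = g.add_node(name="Acme", kind="company")
g.add_edge(ada, lin, "FOLLOWS")
g.add_edge(ada, acme, "WORKS_AT", since=2020)

print(g.neighbors(ada), g.neighbors(ada, label="FOLLOWS"))
```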

Relational Vs. Non-relational DB
Relational | Non-relational
Very rigid structure | Flexibility in structure
Does not accommodate all modern needs for data organization | Plenty of flexible options
Uses primary-key to foreign-key connections, relies on atomic attributes, stronger mechanisms to enforce business logic | Allows for complex and custom data stores, relies on key-value pairs where the value does not have to be an atomic attribute
Built to enforce referential integrity | Cannot enforce relationships between items
Harder to scale | Easy horizontal scalability
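The referential-integrity row of the comparison can be demonstrated with Python's built-in `sqlite3` module next to a plain dictionary standing in for a key-value store. The table names, keys, and values are hypothetical. The relational side rejects an order that points at a missing customer, while the key-value side stores it without complaint.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE orders ("
    " id INTEGER PRIMARY KEY,"
    " customer_id INTEGER NOT NULL REFERENCES customer(id))"
)
conn.execute("INSERT INTO customer VALUES (1, 'Ada')")

try:
    conn.execute("INSERT INTO orders VALUES (10, 99)")  # customer 99 does not exist
    relational_accepted = True
except sqlite3.IntegrityError:
    relational_accepted = False

# Key-value side: nothing checks that customer 99 exists.
kv_store = {"customer:1": {"name": "Ada"}}
kv_store["order:10"] = {"customer_id": 99}
kv_accepted = "order:10" in kv_store

print(relational_accepted, kv_accepted)
```

With a non-relational store, any such check has to live in application code.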

Popular Non-Relational/NoSQL Databases
Amazon DynamoDB
IBM Cloudant
And many more …
https://www.trustradius.com/nosql-databases

Database classification

“Hadoop is a highly scalable analytics platform for processing large volumes of structured and unstructured data. Multiple petabytes of data spread across hundreds or thousands of physical storage servers or nodes.”

Future of databases

References:

A Brief History of Non-Relational Databases

https://docs.microsoft.com/en-us/azure/architecture/data-guide/big-data/non-relational-data
https://www.trustradius.com/nosql-databases

Relational databases vs Non-relational databases

https://aws.amazon.com/nosql/
