Question:- What kind of database is MongoDB?
Answer:- MongoDB is a document-oriented DBMS. We can think of it as MySQL, but with JSON-like objects making up the data model rather than RDBMS tables. Significantly, early versions of MongoDB supported neither joins nor transactions; later releases added the $lookup aggregation stage (version 3.2) for join-like queries and multi-document transactions (version 4.0). It also features secondary indexes, an expressive query language, atomic writes at the per-document level, and fully consistent reads. Operationally, MongoDB offers replication with automated failover (replica sets, which superseded the older master–slave replication) and built-in horizontal scaling via automated range-based partitioning (sharding).
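To make the document model and range-based partitioning more concrete, here is a minimal Python sketch. It is a toy model and not MongoDB's implementation: the split points, the shard names, and the shard_for helper are hypothetical, and plain dictionaries stand in for documents.

```python
from bisect import bisect_right

# Hypothetical split points on the shard key; they define four contiguous
# key ranges: (-inf, "g"), ["g", "n"), ["n", "t"), ["t", +inf).
SPLIT_POINTS = ["g", "n", "t"]
SHARDS = ["shard0", "shard1", "shard2", "shard3"]


def shard_for(key: str) -> str:
    """Return the shard whose key range contains `key`."""
    return SHARDS[bisect_right(SPLIT_POINTS, key)]


# A JSON-like document: nested objects and arrays live inside one record,
# where a relational schema would spread them over several tables.
user = {
    "_id": "mallory",
    "name": "Mallory",
    "addresses": [{"city": "Oslo"}, {"city": "Lima"}],
}

print(shard_for(user["_id"]))  # shard1
print(shard_for("alice"))      # shard0
print(shard_for("zoe"))        # shard3
```

Because each range is contiguous, documents with neighbouring shard-key values land on the same shard, which is what keeps range queries on the shard key local.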
Question:- Which language is MongoDB written in?
Answer:- MongoDB is implemented in C++. Drivers and client libraries, however, are typically written in their own respective languages, although some drivers use C extensions for better performance.
Question:- What are the limitations of the 32-bit versions of MongoDB?
Answer:- MongoDB uses memory-mapped files (in its original MMAPv1 storage engine). When running a 32-bit build of MongoDB, the total storage size for the server, including data and indexes, is limited to about 2 GB. For this reason, we do not deploy MongoDB to production on 32-bit machines. With a 64-bit build of MongoDB, there is virtually no limit on the storage size. For production deployments, 64-bit builds and operating systems are strongly recommended.
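The 2 GB figure can be related to the size of a 32-bit address space with some quick arithmetic. The sketch below assumes (this is not stated in the answer) that the limit arises because memory-mapped files must fit inside the process's virtual address space, part of which is reserved for other uses.

```python
GIB = 2 ** 30  # bytes in one gibibyte

# Total virtual address space reachable with 32-bit and 64-bit pointers.
address_space_32 = 2 ** 32
address_space_64 = 2 ** 64

print(address_space_32 // GIB)  # 4 (GiB in total for a 32-bit process)
print(address_space_64 // GIB)  # 17179869184 (GiB for a 64-bit process)

# The ~2 GB storage limit quoted in the answer is half of the 32-bit space.
limit_fraction = (2 * GIB) / address_space_32
print(limit_fraction)           # 0.5
```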
Question:- When creating a schema in MongoDB, which points need to be taken into consideration?
Answer:- When creating a schema in MongoDB, the following points need to be taken into consideration:
• Design the schema according to the user requirements
• Combine objects into one document if they will be used together; otherwise, separate them
• Do joins on write, not on read
• Optimize the schema for the most frequent use cases
• Do complex aggregation in the schema
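The "combine objects" and "join on write" points can be illustrated with a small Python sketch. It uses plain dictionaries rather than a MongoDB driver, and the build_order helper and all field names are hypothetical, chosen only for illustration.

```python
def build_order(order_id, customer, items):
    """Assemble one self-contained order document at write time."""
    return {
        "_id": order_id,
        # Copy only the customer fields that are read together with the order.
        "customer": {"name": customer["name"], "email": customer["email"]},
        "items": [dict(item) for item in items],
        # Precompute the aggregate once, on write, instead of on every read.
        "total": sum(item["qty"] * item["price"] for item in items),
    }


customer = {"_id": 7, "name": "Ada", "email": "ada@example.com", "loyalty_points": 120}
items = [
    {"sku": "A1", "qty": 2, "price": 9.5},
    {"sku": "B7", "qty": 1, "price": 20.0},
]

order = build_order(1001, customer, items)
print(order["total"])     # 39.0
print(order["customer"])  # {'name': 'Ada', 'email': 'ada@example.com'}
```

Reading the order then needs a single document lookup. The trade-off is that the copied customer fields must be refreshed if the customer record changes.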
Question:- What is Apache HBase?
Answer:- Apache HBase is a column-oriented database used to store sparse data sets. It runs on top of the Hadoop Distributed File System (HDFS), on a Hadoop cluster. Clients can access HBase data through a native Java API or through a Thrift or REST gateway, which makes it accessible from any language. Some of the key properties of HBase include:
• NoSQL: HBase is not a traditional relational database (RDBMS). HBase relaxes the ACID (Atomicity, Consistency, Isolation, Durability) properties of traditional RDBMS systems in order to achieve much greater scalability. Data stored in HBase also does not need to fit into a rigid schema as it does with an RDBMS, which makes HBase well suited to storing unstructured or semi-structured data.
• Wide-Column: HBase stores data in a table-like format with the ability to store billions of rows with millions of columns. Columns can be grouped together in “column families”, which allows physical distribution of row values onto different cluster nodes.
• Distributed and Scalable: HBase groups rows into “regions”, which define how table data is split over multiple nodes in a cluster. If a region gets too large, it is automatically split to share the load across more servers.
• Consistent: HBase is architected to have “strongly consistent” reads and writes, as opposed to other NoSQL databases that are “eventually consistent”. This means that once a write has been performed, all read requests for that data will return the same value.
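The sparse wide-column layout and automatic region splitting described above can be sketched with a toy in-memory model in Python. This is an illustration under simplifying assumptions, not the HBase API: ToyTable and its methods are hypothetical, and a region here splits on a row count, whereas a real cluster splits on stored size.

```python
from bisect import bisect_right, insort


class ToyTable:
    """In-memory stand-in for a wide-column table; not the HBase API."""

    def __init__(self, max_rows_per_region=2):
        self.rows = {}        # row key -> {"family:qualifier": value}; unset columns take no space
        self.split_keys = []  # sorted start keys of every region after the first
        self.max_rows_per_region = max_rows_per_region

    def region_of(self, row_key):
        """Index of the region whose key range contains row_key."""
        return bisect_right(self.split_keys, row_key)

    def rows_in_region(self, region):
        return sorted(k for k in self.rows if self.region_of(k) == region)

    def put(self, row_key, column, value):
        self.rows.setdefault(row_key, {})[column] = value
        keys = self.rows_in_region(self.region_of(row_key))
        if len(keys) > self.max_rows_per_region:
            # Split at the middle row key; it becomes the start of a new region.
            insort(self.split_keys, keys[len(keys) // 2])


table = ToyTable(max_rows_per_region=2)
table.put("a", "info:name", "Ann")
table.put("b", "info:name", "Bob")
table.put("c", "stats:logins", 3)  # third row pushes region 0 over the limit

print(table.split_keys)         # ['b']
print(table.rows_in_region(0))  # ['a']
print(table.rows_in_region(1))  # ['b', 'c']
```

Each row stores only the columns that were actually written, which is what makes the layout suitable for sparse data, and the sorted split keys let a client find the right region for any row key with a binary search.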
Question:- Compare HBase & Cassandra
Answer:-
• HBase: cluster based on Hadoop; best suited to batch jobs; API via REST/Thrift
• Cassandra: peer-to-peer cluster; best suited to data writes; API via Thrift
Question:- Name the key components of HBase.
Answer:- The key components of HBase are ZooKeeper, RegionServer, Region, Catalog Tables, and HBase Master.
Question:- What is S3?
Answer:- S3 stands for Simple Storage Service, and it is one of the file systems that HBase can use.
Question:- What is the use of the get() method?
Answer:- The get() method is used to read data from a table.
Question:- What is the reason for using HBase?
Answer:- HBase is used because it provides random read and write operations and can perform a large number of operations per second on large data sets.
Question:- In how many modes can HBase run?
Answer:- HBase has two run modes: standalone and distributed.
Question:- What is the difference between Hive and HBase?
Answer:- HBase supports record-level operations, whereas Hive does not.
Question:- Define column families.
Answer:- A column family is a collection of columns, whereas a row is a collection of column families.
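The nesting of rows, column families, and columns can be pictured with nested Python dictionaries. This is only a mental model with hypothetical names (the users table and the read_row helper are invented for illustration), not HBase client code.

```python
# row key -> column family -> column qualifier -> value
users = {
    "user#1": {
        "info": {"name": "Ann", "email": "ann@example.com"},
        "stats": {"logins": 42},
    },
    # Rows may leave whole families (or single columns) empty.
    "user#2": {
        "info": {"name": "Bob"},
    },
}


def read_row(table, row_key, family=None):
    """Read one row by key; optionally restrict the result to one column family."""
    row = table.get(row_key, {})
    if family is None:
        return row
    return {family: row.get(family, {})}


print(read_row(users, "user#1", "stats"))  # {'stats': {'logins': 42}}
print(read_row(users, "user#2", "stats"))  # {'stats': {}}
print(read_row(users, "missing"))          # {}
```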
Question:- Define standalone mode in HBase.
Answer:- It is the default mode of HBase. In standalone mode, HBase does not use HDFS; it uses the local filesystem instead, and it runs all HBase daemons and a local ZooKeeper in the same JVM process.
