Big data refers to large volumes of diverse data that are generated at high speed. It is commonly characterized by the three V's: volume, velocity, and variety. Volume is the sheer amount of data, velocity is the speed at which data is generated and must be processed, and variety is the range of data types, including structured, semi-structured, and unstructured data. Big data comes from sources such as users, sensors, and applications, and it requires distributed storage and processing with tools such as Hadoop and the MapReduce programming model. Analyzing big data can provide a competitive advantage by revealing hidden patterns, supporting better decision making, and improving business operations. Programming languages such as Python, Java, R, and Scala are commonly used for big data applications.
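
To make the MapReduce idea mentioned above more concrete, here is a minimal single-process sketch of a word count in Python. It only imitates the three phases (map, shuffle, reduce) in memory; it does not use Hadoop or any distributed framework, and the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) and the sample documents are illustrative assumptions, not part of any real API.

```python
from collections import defaultdict


def map_phase(document):
    """Emit a (word, 1) pair for every whitespace-separated word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle_phase(pairs):
    """Group emitted values by key, imitating what a framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Combine all values for one key into a single total."""
    return key, sum(values)


def word_count(documents):
    """Run the three phases in sequence on a list of text documents (hypothetical helper)."""
    pairs = []
    for document in documents:
        # In a real cluster each document (or split) would be mapped on a different node.
        pairs.extend(map_phase(document))
    grouped = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    sample_documents = ["big data needs big storage", "data arrives fast"]
    print(word_count(sample_documents))
```

In a distributed setting the map calls would run in parallel on separate machines and the shuffle would move data across the network; the sketch keeps everything in one process so the data flow is easy to follow.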