Big Data Systems (WT 2020/21)

Series: Big Data Systems (WT 2020/21) - tele-TASK
Series URL: https://www.tele-task.de/series/1327/
Lecturer: Prof. Dr. Tilmann Rabl
Produced with: tele-TASK, Hasso Plattner Institute (HPI)
Contact: tele-task@hpi.de
Language: en
Copyright: tele-TASK
Keywords: tele-TASK, HPI, computer science, technology, Germany, Potsdam
Feed last updated: Wed, 24 Apr 2024 03:34:38 GMT

The amount of data that can be generated and stored in academic and industrial projects and applications is increasing rapidly. Big data analytics technologies have established themselves as a solution to the scalability problems of traditional database systems. The vast amounts of newly collected data, however, are usually not as easily analyzed as curated, structured data in a data warehouse. Typically, these data are noisy, of varying format and velocity, and need to be analyzed with techniques from statistics and machine learning rather than pure SQL-like aggregations and drill-downs. Moreover, the results of the analyses are frequently models that are used for decision making and prediction.

The complete process of big data analysis is described as a pipeline, which includes data recording, cleaning, integration, modeling, and interpretation. In this lecture, we will discuss big data systems, i.e., infrastructures that are used to handle all steps in typical big data processing pipelines. We will learn about data center infrastructure and scale-out software systems. The software discussed will cover the full big data stack, i.e., distributed file systems, MapReduce, key-value stores, stream processing, graph processing, and ML systems.

Lectures, newest first (title | speaker | duration | language | published | URL):

Fair Benchmarking | Prof. Dr. Tilmann Rabl | 00:12:00 | en | Mon, 01 Feb 2021 11:30:00 GMT | https://www.tele-task.de/lecture/video/8530/
BigBench / TPCx-BB - Big Data Benchmark | Prof. Dr. Tilmann Rabl | 00:13:48 | en | Mon, 01 Feb 2021 11:15:00 GMT | https://www.tele-task.de/lecture/video/8529/
Sort Benchmarks | Prof. Dr. Tilmann Rabl | 00:16:25 | en | Mon, 01 Feb 2021 10:55:00 GMT | https://www.tele-task.de/lecture/video/8528/
Benchmarks | Prof. Dr. Tilmann Rabl | 00:09:28 | en | Mon, 01 Feb 2021 10:45:00 GMT | https://www.tele-task.de/lecture/video/8527/
Some Statistics | Prof. Dr. Tilmann Rabl | 00:47:45 | en | Mon, 01 Feb 2021 09:55:00 GMT | https://www.tele-task.de/lecture/video/8526/
Measurements & Metrics | Prof. Dr. Tilmann Rabl | 00:14:22 | en | Mon, 01 Feb 2021 09:40:00 GMT | https://www.tele-task.de/lecture/video/8525/
Back of the Envelope Calculation | Prof. Dr. Tilmann Rabl | 00:12:06 | en | Mon, 01 Feb 2021 09:25:00 GMT | https://www.tele-task.de/lecture/video/8524/
Introduction | Prof. Dr. Tilmann Rabl | 00:08:49 | en | Mon, 01 Feb 2021 09:15:00 GMT | https://www.tele-task.de/lecture/video/8523/
A Brief Introduction to RDMAs | Pedro Silva | 00:22:00 | en | Tue, 26 Jan 2021 10:45:00 GMT | https://www.tele-task.de/lecture/video/8516/
Intro to Persistent Memory II | Lawrence Benson | 00:20:25 | en | Tue, 26 Jan 2021 10:25:00 GMT | https://www.tele-task.de/lecture/video/8515/
Intro to Persistent Memory I | Lawrence Benson | 00:10:23 | en | Tue, 26 Jan 2021 10:15:00 GMT | https://www.tele-task.de/lecture/video/8514/
Data Processing on GPUs | Ilin Tolovski | 00:37:26 | en | Tue, 26 Jan 2021 09:35:00 GMT | https://www.tele-task.de/lecture/video/8513/
Modern Hardware I | Prof. Dr. Tilmann Rabl | 00:19:31 | en | Tue, 26 Jan 2021 09:15:00 GMT | https://www.tele-task.de/lecture/video/8512/
Data-Parallel Parameter Server | Prof. Dr. Tilmann Rabl | 00:22:05 | de | Mon, 18 Jan 2021 11:55:00 GMT | https://www.tele-task.de/lecture/video/8474/
Execution Strategies | Prof. Dr. Tilmann Rabl | 00:44:08 | de | Mon, 18 Jan 2021 11:10:00 GMT | https://www.tele-task.de/lecture/video/8480/
SystemML | Prof. Dr. Tilmann Rabl | 00:28:52 | de | Mon, 18 Jan 2021 10:40:00 GMT | https://www.tele-task.de/lecture/video/8473/
Language Abstraction & System Architectures | Prof. Dr. Tilmann Rabl | 00:18:46 | de | Mon, 18 Jan 2021 10:20:00 GMT | https://www.tele-task.de/lecture/video/8472/
ML System Stack | Prof. Dr. Tilmann Rabl | 00:28:02 | de | Mon, 18 Jan 2021 09:50:00 GMT | https://www.tele-task.de/lecture/video/8471/
Machine Learning Models | Prof. Dr. Tilmann Rabl | 00:15:58 | de | Mon, 18 Jan 2021 09:35:00 GMT | https://www.tele-task.de/lecture/video/8470/
Introduction | Prof. Dr. Tilmann Rabl | 00:18:29 | de | Mon, 18 Jan 2021 09:15:00 GMT | https://www.tele-task.de/lecture/video/8469/
Federated Machine Learning | Prof. Dr. Tilmann Rabl | 00:13:48 | de | Mon, 18 Jan 2021 00:20:00 GMT | https://www.tele-task.de/lecture/video/8475/
Graph Databases | Pedro Silva | 02:59:18 | en | Thu, 07 Jan 2021 09:15:00 GMT | https://www.tele-task.de/lecture/video/8451/
Relational & Big-Data Processing in the Enterprise - Bridging the Gap | Dr. Alexander Böhm | 00:57:17 | en | Wed, 06 Jan 2021 09:15:00 GMT | https://www.tele-task.de/lecture/video/8448/
MapReduce II | Prof. Dr. Tilmann Rabl | 00:13:12 | en | Mon, 30 Nov 2020 09:15:00 GMT | https://www.tele-task.de/lecture/video/8405/
(Big Data) File Format | Prof. Dr. Tilmann Rabl | 00:13:57 | en | Wed, 25 Nov 2020 10:55:00 GMT | https://www.tele-task.de/lecture/video/8394/
Erasure Coding | Prof. Dr. Tilmann Rabl | 00:09:46 | en | Wed, 25 Nov 2020 10:45:00 GMT | https://www.tele-task.de/lecture/video/8393/
Hadoop Distributed File System | Prof. Dr. Tilmann Rabl | 00:11:03 | en | Wed, 25 Nov 2020 10:30:00 GMT | https://www.tele-task.de/lecture/video/8392/
Google File System | Prof. Dr. Tilmann Rabl | 00:20:13 | en | Wed, 25 Nov 2020 10:10:00 GMT | https://www.tele-task.de/lecture/video/8391/
Network File System | Prof. Dr. Tilmann Rabl | 00:16:52 | en | Wed, 25 Nov 2020 09:40:00 GMT | https://www.tele-task.de/lecture/video/8390/
Basics of File Systems | Prof. Dr. Tilmann Rabl | 00:22:37 | en | Wed, 25 Nov 2020 09:15:00 GMT | https://www.tele-task.de/lecture/video/8389/
Cloud Applications | Prof. Dr. Tilmann Rabl | 00:12:00 | en | Mon, 23 Nov 2020 10:45:00 GMT | https://www.tele-task.de/lecture/video/8381/
Cloud Computing | Prof. Dr. Tilmann Rabl | 00:14:12 | en | Mon, 23 Nov 2020 10:30:00 GMT | https://www.tele-task.de/lecture/video/8380/
Scheduling | Prof. Dr. Tilmann Rabl | 00:20:53 | en | Mon, 23 Nov 2020 10:10:00 GMT | https://www.tele-task.de/lecture/video/8379/
Virtualization | Prof. Dr. Tilmann Rabl | 00:16:29 | en | Mon, 23 Nov 2020 09:50:00 GMT | https://www.tele-task.de/lecture/video/8378/
Introduction | Prof. Dr. Tilmann Rabl | 00:24:27 | en | Mon, 23 Nov 2020 09:15:00 GMT | https://www.tele-task.de/lecture/video/8377/
MR Algorithms | Prof. Dr. Tilmann Rabl | 00:11:26 | en | Mon, 16 Nov 2020 10:25:00 GMT | https://www.tele-task.de/lecture/video/8358/
MapReduce Architecture | Prof. Dr. Tilmann Rabl | 00:21:51 | en | Mon, 16 Nov 2020 10:00:00 GMT | https://www.tele-task.de/lecture/video/8357/
Sorting in Detail | Prof. Dr. Tilmann Rabl | 00:16:25 | en | Mon, 16 Nov 2020 09:40:00 GMT | https://www.tele-task.de/lecture/video/8356/
Map - Sort - Reduce | Prof. Dr. Tilmann Rabl | 00:14:47 | en | Mon, 16 Nov 2020 09:25:00 GMT | https://www.tele-task.de/lecture/video/8355/
Introduction | Prof. Dr. Tilmann Rabl | 00:06:23 | en | Mon, 16 Nov 2020 09:15:00 GMT | https://www.tele-task.de/lecture/video/8354/
Further Evolution | Prof. Dr. Tilmann Rabl | 00:07:22 | de | Fri, 06 Nov 2020 10:20:00 GMT | https://www.tele-task.de/lecture/video/8343/
Open Source Big Data Stack | Prof. Dr. Tilmann Rabl | 00:15:59 | de | Fri, 06 Nov 2020 10:00:00 GMT | https://www.tele-task.de/lecture/video/8342/
Google's Big Data Stack | Prof. Dr. Tilmann Rabl | 00:19:42 | de | Fri, 06 Nov 2020 09:40:00 GMT | https://www.tele-task.de/lecture/video/8341/
The Big Data Stack | Prof. Dr. Tilmann Rabl | 00:10:15 | de | Fri, 06 Nov 2020 09:30:00 GMT | https://www.tele-task.de/lecture/video/8340/
Introduction | Prof. Dr. Tilmann Rabl | 00:14:49 | en | Fri, 06 Nov 2020 09:15:00 GMT | https://www.tele-task.de/lecture/video/8330/
Introduction | Prof. Dr. Tilmann Rabl | 01:18:38 | en | Tue, 03 Nov 2020 09:15:00 GMT | https://www.tele-task.de/lecture/video/8322/
Welcome | Prof. Dr. Tilmann Rabl | 00:02:09 | en | Fri, 02 Oct 2020 09:15:00 GMT | https://www.tele-task.de/lecture/video/8309/