9. map.pl

    #!/usr/bin/env perl
    use strict;
    use warnings;

    # Streaming mapper: emit "<ninth field>\t1" for every input line.
    while (<>) {
        chomp;
        my @segments = split /\s+/;
        printf "%s\t%s\n", $segments[8], 1;
    }
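10. reduce.pl

    #!/usr/bin/env perl
    use strict;
    use warnings;

    # Streaming reducer: sum the values emitted for each key.
    my %count;
    while (<>) {
        chomp;
        my ($key, $value) = split /\t/;
        $count{$key} += $value;   # add the value instead of assuming it is 1
    }
    while (my ($key, $value) = each %count) {
        printf "%s\t%s\n", $key, $value;
    }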
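For readers who want to trace the map.pl / reduce.pl pair without a cluster, the following is a minimal Java sketch of the same two steps run in memory. The class name StreamingSimulation, the length guard in the map step, and the sample input lines (written in an access-log-like layout) are assumptions made for illustration; none of them come from the slides. Because every emitted value is 1, summing the values and counting the keys give the same totals.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical local stand-in for the streaming job; not part of the original deck.
public class StreamingSimulation {

    // Map step: like map.pl, emit "<ninth field>\t1" for each input line.
    static List<String> map(List<String> lines) {
        List<String> out = new ArrayList<>();
        for (String line : lines) {
            String[] segments = line.split("\\s+");
            // Guard against short lines (an addition; map.pl has no such check).
            if (segments.length > 8) {
                out.add(segments[8] + "\t1");
            }
        }
        return out;
    }

    // Reduce step: like reduce.pl, aggregate the values per key.
    static Map<String, Integer> reduce(List<String> pairs) {
        Map<String, Integer> counts = new TreeMap<>();   // TreeMap keeps keys sorted
        for (String pair : pairs) {
            String[] kv = pair.split("\t");
            counts.merge(kv[0], Integer.parseInt(kv[1]), Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Made-up sample lines; the ninth whitespace-separated field is 200, 404, 200.
        List<String> lines = Arrays.asList(
            "host1 - - [01/Jan/2012:00:00:00 +0900] \"GET /a HTTP/1.1\" 200 10",
            "host2 - - [01/Jan/2012:00:00:01 +0900] \"GET /b HTTP/1.1\" 404 20",
            "host3 - - [01/Jan/2012:00:00:02 +0900] \"GET /a HTTP/1.1\" 200 10");
        reduce(map(lines)).forEach((k, v) -> System.out.println(k + "\t" + v));
    }
}
```

With the three sample lines, main prints one line per distinct ninth field: 200 with a count of 2, then 404 with a count of 1.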
16. Quick sort in Scala

    // The view bound (<%) is Scala 2 syntax.
    def qsort[T <% Ordered[T]](list: List[T]): List[T] = list match {
      case Nil => Nil
      case pivot :: tail =>
        qsort(tail.filter(_ < pivot)) ::: pivot :: qsort(tail.filter(_ >= pivot))
    }

    scala> qsort(List(2,1,3))
    res1: List[Int] = List(1, 2, 3)
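Since the deck goes on to compare Java with Scala, here is a rough Java counterpart of the same quick sort, offered only as a sketch for comparison. The class name QuickSort is hypothetical, and the code follows the Scala version's strategy (first element as pivot, smaller elements to the left, the rest to the right) rather than any implementation from the slides.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical comparison class; not part of the original deck.
public class QuickSort {

    // Returns a new sorted list; the input list is not modified.
    static <T extends Comparable<T>> List<T> qsort(List<T> list) {
        if (list.isEmpty()) {
            return new ArrayList<>();
        }
        T pivot = list.get(0);
        List<T> smaller = new ArrayList<>();   // elements < pivot
        List<T> rest = new ArrayList<>();      // elements >= pivot
        for (T x : list.subList(1, list.size())) {
            if (x.compareTo(pivot) < 0) {
                smaller.add(x);
            } else {
                rest.add(x);
            }
        }
        List<T> result = qsort(smaller);
        result.add(pivot);
        result.addAll(qsort(rest));
        return result;
    }

    public static void main(String[] args) {
        System.out.println(qsort(Arrays.asList(2, 1, 3)));   // prints [1, 2, 3]
    }
}
```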
17. WordCount in Java

    public class WordCount {

      public static class Map extends MapReduceBase
          implements Mapper<LongWritable, Text, Text, IntWritable> {
        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();

        // Emit (token, 1) for every whitespace-separated token in the line.
        public void map(LongWritable key, Text value,
                        OutputCollector<Text, IntWritable> output,
                        Reporter reporter) throws IOException {
          String line = value.toString();
          StringTokenizer tokenizer = new StringTokenizer(line);
          while (tokenizer.hasMoreTokens()) {
            word.set(tokenizer.nextToken());
            output.collect(word, one);
          }
        }
      }

      public static class Reduce extends MapReduceBase
          implements Reducer<Text, IntWritable, Text, IntWritable> {
        ...
18. WordCount in Scala

    // Compiles only with implicit conversions between the Hadoop Writable
    // types and Scala types in scope; they are not shown on this slide.
    object WordCount {

      class MyMap extends Mapper[LongWritable, Text, Text, IntWritable] {
        val one = 1

        override def map(ky: LongWritable, value: Text,
            output: Mapper[LongWritable, Text, Text, IntWritable]#Context) = {
          (value split " ") foreach (output write (_, one))
        }
      }

      class MyReduce extends Reducer[Text, IntWritable, Text, IntWritable] {
        override def reduce(key: Text, values: java.lang.Iterable[IntWritable],
            output: Reducer[Text, IntWritable, Text, IntWritable]#Context) = {
          val iter: Iterator[IntWritable] = values.iterator()
          val sum = iter reduceLeft ((a: Int, b: Int) => a + b)
          output write (key, sum)
        }
      }

      def main(args: Array[String]) = {
        ...
19. Java vs Scala

Java:

    public void map(LongWritable key, Text value,
                    OutputCollector<Text, IntWritable> output,
                    Reporter reporter) throws IOException {
      String line = value.toString();
      StringTokenizer tokenizer = new StringTokenizer(line);
      while (tokenizer.hasMoreTokens()) {
        word.set(tokenizer.nextToken());
        output.collect(word, one);
      }
    }

Scala:

    override def map(ky: LongWritable, value: Text,
        output: Mapper[LongWritable, Text, Text, IntWritable]#Context) = {
      (value split " ") foreach (output write (_, one))
    }
22. reducer

    class MyReduce extends Reducer[Text, IntWritable, Text, IntWritable] {
      // Sum all counts received for one key and write the total.
      override def reduce(key: Text, values: java.lang.Iterable[IntWritable],
          output: Reducer[Text, IntWritable, Text, IntWritable]#Context) = {
        val iter: Iterator[IntWritable] = values.iterator()
        val sum = iter reduceLeft ((a: Int, b: Int) => a + b)
        output write (key, sum)
      }
    }
23. main

    def main(args: Array[String]) = {
      val conf = new Configuration()
      val otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs()
      val job = new Job(conf, "word count")

      // Wire up the mapper, combiner and reducer classes.
      job setJarByClass(WordCount getClass())
      job setMapperClass(classOf[WordCount.MyMap])
      job setCombinerClass(classOf[WordCount.MyReduce])
      job setReducerClass(classOf[WordCount.MyReduce])

      // Key and value types for the map output and the final output.
      job setMapOutputKeyClass(classOf[Text])
      job setMapOutputValueClass(classOf[IntWritable])
      job setOutputKeyClass(classOf[Text])
      job setOutputValueClass(classOf[IntWritable])

      // Input and output paths come from the remaining command-line arguments.
      FileInputFormat addInputPath(job, new Path(otherArgs(0)))
      FileOutputFormat setOutputPath(job, new Path(otherArgs(1)))

      System exit(job waitForCompletion(true) match {
        case true => 0
        case false => 1
      })
    }
24. HDFS

    import java.net.URI
    import org.apache.hadoop.fs._
    import org.apache.hadoop.hdfs._
    import org.apache.hadoop.conf.Configuration

    object Hdfs {
      def main(args: Array[String]) = {
        val conf = new Configuration()
        val uri = new URI("hdfs://hadoop01:9000/")
        val fs = new DistributedFileSystem
        fs.initialize(uri, conf)

        // Print the modification time of the path given as the first argument.
        val status = fs.getFileStatus(new Path(args(0)))
        println(status.getModificationTime)
      }
    }