Spark machine learning & deep learning (hoondong kim)
Spark Machine Learning and Deep Learning Deep Dive.
Scenarios that use Spark hybrid with other data analytics tools (MS R on Spark, Tensorflow(keras) with Spark, Scikit-learn with Spark, etc)
Golang Project Guide from A to Z: From Feature Development to Enterprise Appl... (Kyuhyun Byun)
This comprehensive presentation offers a deep dive into Go language development methodologies, covering projects of all scales. Whether you're working on a small prototype or a large-scale enterprise application, this guide provides valuable insights and best practices.
Key topics covered:
Distinguishing between small and large projects in Go
Code patterns for small, feature-focused projects
Comparison of Handler and HandlerFunc approaches
Enterprise application design using Domain Driven Design (DDD)
Detailed explanations of architectural layers: Presenter, Handler, Usecase, Service, Repository, and Recorder
NoSQL (DynamoDB) modeling techniques
Writing effective test code and using mocking tools like 'counterfeiter'
Essential tools for production-ready applications: APM, error monitoring, metric collection, and logging services
This presentation is ideal for Go developers of all levels, from beginners looking to structure their first projects to experienced developers aiming to optimize large-scale applications. It provides practical advice on code organization, testing strategies, and operational considerations to help you build robust, maintainable Go applications.
Whether you're starting a new project or looking to improve an existing one, this guide offers valuable insights into Go development best practices across different project scales and complexities.
12. Buffer vs. Aggregator
Common points
Both can only be applied to the result of a GroupBy or CoGroup
Both default their Output Selector to Fields.ALL
Differences
Aggregators can be chained in a single Every pipe, but Buffers cannot be chained
A Buffer can emit any number of result tuples per group, not just one
A Buffer can implement anything an Aggregator does, but an Aggregator can be optimized more aggressively than a Buffer
pipe = new GroupBy( pipe, new Fields( "mdn" ), new Fields( "log_time" ) );
pipe = new Every( pipe, new Count( new Fields( "count" ) ) );
pipe = new Every( pipe, new Fields( "mdn" ), new DistinctCount( new Fields( "unique_mdn_cnt" ) ) );
pipe = new Every( pipe, new Fields( "pay_amt" ), new Sum( new Fields( "sum" ), long.class ) );
pipe = new Every( pipe, new Fields( "log_time" ), new Last( new Fields( "last_time" ) ) );
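The contrast above can be illustrated without Cascading at all: an Aggregator folds each group down to one value, while a Buffer iterates the whole group and may emit a tuple per input. A minimal plain-Java sketch of the two semantics (class and method names are illustrative, not Cascading API):

```java
import java.util.*;
import java.util.stream.*;

public class GroupSemantics {
    // Aggregator-style: exactly one output value per group (e.g. Count).
    static Map<String, Long> countPerGroup(Map<String, List<Integer>> groups) {
        return groups.entrySet().stream()
            .collect(Collectors.toMap(Map.Entry::getKey,
                                      e -> (long) e.getValue().size()));
    }

    // Buffer-style: walk the whole group and emit one result tuple per
    // input tuple (a running total) -- something chained Aggregators cannot do.
    static Map<String, List<Integer>> runningTotals(Map<String, List<Integer>> groups) {
        Map<String, List<Integer>> out = new LinkedHashMap<>();
        for (Map.Entry<String, List<Integer>> e : groups.entrySet()) {
            int sum = 0;
            List<Integer> totals = new ArrayList<>();
            for (int v : e.getValue()) {
                sum += v;
                totals.add(sum); // emit per input tuple, not once per group
            }
            out.put(e.getKey(), totals);
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, List<Integer>> groups = new LinkedHashMap<>();
        groups.put("a", List.of(1, 2, 3));
        groups.put("b", List.of(10));
        System.out.println(countPerGroup(groups));
        System.out.println(runningTotals(groups));
    }
}
```

This is also why a Buffer can subsume an Aggregator: emitting exactly one tuple per group is just a special case of emitting many.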
16.
String inPath = args[ 0 ];
String outPath = args[ 1 ];
Properties properties = new Properties();
AppProps.setApplicationJarClass( properties, Main.class );
HadoopFlowConnector flowConnector = new HadoopFlowConnector( properties );
Tap inTap = new Hfs( new TextDelimited( true, "\t" ), inPath );
Tap outTap = new Hfs( new TextDelimited( true, "\t" ), outPath );
Pipe copyPipe = new Pipe( "copy" );
FlowDef flowDef = FlowDef.flowDef()
  .addSource( copyPipe, inTap )
  .addTailSink( copyPipe, outTap );
flowConnector.connect( flowDef ).complete();
https://github.com/Cascading/Impatient/blob/master/part1/src/main/java/impatient/Main.java
p.29 1.2 A first Cascading application
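The flow on slide 16 does nothing but copy a tab-delimited file from the source tap to the sink tap, header line included. The same effect in dependency-free Java, as a rough sketch of what the single-pipe flow computes (local files standing in for HDFS paths):

```java
import java.io.IOException;
import java.nio.file.*;
import java.util.List;

public class CopyTsv {
    // Copy a TSV file line by line, header included -- the work the
    // one-pipe Cascading flow above performs between its two Hfs taps.
    public static void copy(Path in, Path out) throws IOException {
        List<String> lines = Files.readAllLines(in);
        Files.write(out, lines);
    }

    public static void main(String[] args) throws IOException {
        copy(Path.of(args[0]), Path.of(args[1]));
    }
}
```

The point of the Cascading version is not the copy itself but the scaffolding: taps, pipes, and a FlowDef that later examples extend.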
17.
Tap docTap = new Hfs( new TextDelimited( true, "\t" ), docPath );
Tap wcTap = new Hfs( new TextDelimited( true, "\t" ), wcPath );
Fields token = new Fields( "token" );
Fields text = new Fields( "text" );
RegexSplitGenerator splitter = new RegexSplitGenerator( token, "[ \\[\\]\\(\\),.]" );
Pipe docPipe = new Each( "token", text, splitter, Fields.RESULTS );
Pipe wcPipe = new Pipe( "wc", docPipe );
wcPipe = new GroupBy( wcPipe, token );
wcPipe = new Every( wcPipe, Fields.ALL, new Count(), Fields.ALL );
FlowDef flowDef = FlowDef.flowDef().setName( "wc" )
  .addSource( docPipe, docTap )
  .addTailSink( wcPipe, wcTap );
https://github.com/Cascading/Impatient/blob/master/part2/src/main/java/impatient/Main.java
p.37 1.5 Word count
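What the RegexSplitGenerator + GroupBy + Count pipeline on slide 17 computes is an ordinary word count. A standalone sketch of the same tokenize-group-count logic, using the slide's split regex (class and method names here are illustrative):

```java
import java.util.*;
import java.util.stream.*;

public class WordCount {
    // Tokenize with the same delimiters as the slide's RegexSplitGenerator,
    // then count occurrences per token (the GroupBy + Count stage).
    static Map<String, Long> count(List<String> docs) {
        return docs.stream()
            .flatMap(line -> Arrays.stream(line.split("[ \\[\\]\\(\\),.]")))
            .filter(tok -> !tok.isEmpty())
            .collect(Collectors.groupingBy(t -> t, Collectors.counting()));
    }

    public static void main(String[] args) {
        System.out.println(count(List.of(
            "A rain shadow is a dry area",
            "This sinking, dry air produces a rain shadow")));
    }
}
```

Note that, like the slide 17 flow, this does no case folding: "A" and "a" are counted separately, which is exactly what the scrub step on the next slides fixes.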
18.
Fields token = new Fields( "token" );
Fields text = new Fields( "text" );
RegexSplitGenerator splitter = new RegexSplitGenerator( token, "[ \\[\\]\\(\\),.]" );
Fields fieldSelector = new Fields( "doc_id", "token" );
Pipe docPipe = new Each( "token", text, splitter, fieldSelector );
Fields scrubArguments = new Fields( "doc_id", "token" );
docPipe = new Each( docPipe, scrubArguments, new ScrubFunction( scrubArguments ), Fields.RESULTS );
Pipe wcPipe = new Pipe( "wc", docPipe );
wcPipe = new Retain( wcPipe, token );
wcPipe = new GroupBy( wcPipe, token );
wcPipe = new Every( wcPipe, Fields.ALL, new Count(), Fields.ALL );
FlowDef flowDef = FlowDef.flowDef().setName( "wc" )
  .addSource( docPipe, docTap )
  .addTailSink( wcPipe, wcTap );
Flow wcFlow = flowConnector.connect( flowDef );
wcFlow.writeDOT( "dot/wc.dot" );
wcFlow.complete();
https://github.com/Cascading/Impatient/blob/master/part3/src/main/java/impatient/Main.java
doc_id  text
doc01   A rain shadow is a dry area on the lee back
doc02   This sinking, dry air produces a rain shadow,
doc03   A rain shadow is an area of dry land that lies
p.55 2.2
19.
public class ScrubFunction extends BaseOperation implements Function
{
  public ScrubFunction( Fields fieldDeclaration )
  {
    super( 2, fieldDeclaration );
  }

  public void operate( FlowProcess flowProcess, FunctionCall functionCall )
  {
    TupleEntry argument = functionCall.getArguments();
    String doc_id = argument.getString( 0 );
    String token = scrubText( argument.getString( 1 ) );

    if( token.length() > 0 )
    {
      Tuple result = new Tuple();
      result.add( doc_id );
      result.add( token );
      functionCall.getOutputCollector().add( result );
    }
  }

  public String scrubText( String text )
  {
    return text.trim().toLowerCase();
  }
}
https://github.com/Cascading/Impatient/blob/master/part3/src/main/java/impatient/ScrubFunction.java
p.49 2.1
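The actual cleanup in ScrubFunction is its scrubText method: trim, lower-case, and drop tokens that end up empty before they reach the output collector. A dependency-free sketch of that same scrub-and-filter step (class name is illustrative):

```java
import java.util.*;
import java.util.stream.*;

public class Scrub {
    // Same normalization as ScrubFunction.scrubText on the slide.
    static String scrubText(String text) {
        return text.trim().toLowerCase();
    }

    // Scrub a token list and drop tokens that become empty, mirroring
    // the length check before the output collector is called.
    static List<String> scrubAll(List<String> tokens) {
        return tokens.stream()
            .map(Scrub::scrubText)
            .filter(t -> !t.isEmpty())
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(scrubAll(List.of(" Rain ", "SHADOW", "  ")));
        // [rain, shadow]
    }
}
```

Dropping empty tokens inside the Function (by simply not calling the collector) is what lets the slide 18 flow skip a separate filter pipe.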