This project provides the ability to interact with DataDoodler and BankerDoodle resources on AWS, including initiating and monitoring the ETL (extract/transform/load) processes.
In a nutshell, DataDoodler is a platform for quickly ‘doodling’ data analytics. It aims to do for the analytics world what Plunker (http://plnkr.co/) has done for the world of application development.
BankerDoodle is an application built on top of the DataDoodler platform. It inherits all the analytical goodness of DataDoodler, but contains features specific to the banking industry. It is a closed/commercial product that will be the initial means to monetize the DataDoodler platform. It is conceivable that any number of applications (open or closed) will be built on top of the DataDoodler platform.
This document is intended to be a guided tour through some of my code. It will provide some insight into my skill set, coding style, and overall approach to developing apps. Scroll through this document to find the following examples:
fdic-sdi-manager (node-based ETL with lots of cool es6 features)
Blooming Menu Directive (angular directive, animation)
Address Verification Directive (angular directive, promise-based DOM manipulation)
I built this package for compatibility with chess.com, so contact me if you want me to use it for other sources. Everything required for this demo can be found in my chessDoodles workspace.
# clear workspace
rm(list=ls())
# load workspace
load("chessDoodles.RData")
A link to a chess.com game always ends with a game ID. Here are two examples:
These games can be viewed publicly, but scraping the pgn for each requires a username and password. I think it’s easiest to embed the password into the link with the format “http://username:password@chess.com/…”.
# store username and password
Username <- "thinkboolean"
Password <- "blogChess" # counting on you not to abuse this; feel free to contact me for details
I keep track of chess positions using a data frame of 64 variables, one for each square of the chessboard. An example of an empty chessboard can be generated with the empty function:
load("chessDoodles.RData")
position <- empty(gameName = "example")
position
##               a8 a7 a6 a5 a4 a3 a2 a1 b8 b7 b6 b5 b4 b3 b2 b1 c8 c7 c6 c5
## example_empty NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
##               c4 c3 c2 c1 d8 d7 d6 d5 d4 d3 d2 d1 e8 e7 e6 e5 e4 e3 e2 e1
## example_empty NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
##               f8 f7 f6 f5 f4 f3 f2 f1 g8 g7 g6 g5 g4 g3 g2 g1 h8 h7 h6 h5
## example_empty NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
##               h4 h3 h2 h1
## example_empty NA NA NA NA
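To make the layout concrete, here is a minimal sketch of how a one-row, 64-column board like the one above could be built. The name empty_sketch is hypothetical and this is not the package's actual empty function; it only reproduces the column order (files a to h, ranks 8 down to 1) and the gameName_empty row name seen in the output.

```r
# Hypothetical stand-in for empty(); not the package's implementation.
empty_sketch <- function(gameName) {
  # Square names in the order shown above: a8, a7, ..., a1, b8, ..., h1
  squares <- paste0(rep(letters[1:8], each = 8), rep(8:1, times = 8))
  # One row of NA values, named in gameName_move format
  board <- matrix(NA_character_, nrow = 1, ncol = 64,
                  dimnames = list(paste0(gameName, "_empty"), squares))
  as.data.frame(board, stringsAsFactors = FALSE)
}

board <- empty_sketch("example")
dim(board)
```

Storing every square as a character column keeps later rows (with values such as "white pawn") compatible with this first row.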
Note that columns are named after squares of the chessboard, and rows are named in gameName_move format. An example of a chessboard where the pieces have not been moved can be generated with the setup function (which internally calls the empty function):
position <- setup(gameName = "game1")
position[,c("d1","e1","d8","e8")]
##                      d1         e1          d8         e8
## game1_empty        <NA>       <NA>        <NA>       <NA>
## game1_zero  white Queen white King black Queen black King
The position object is built row-by-row using the newPosition function.
Applications that are deployed to various physical and logical environments (AWS, Heroku, Dev, Prod, Local, Test, Debug, Language, TodayDate, etc…) will need a way to dynamically adapt to their current environment. The path to a file on AWS is definitely different than a path to a file on a disconnected development box.
A note about supplying credentials in an application:
Credentials for accessing outside resources (databases, s3 buckets, deployment services, etc…) should not be included in any source file (config or hardcoded in app) that will be checked-in to a source control system.
Config File
A smart module for handling config files is necessary. You can create your own, or just use one that has already been created and tested, like nodejs-config.
npm install nodejs-config
Environment Variables
Sometimes environment variables are the way to go, for example when the application needs credentials such as a username and password.
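As a rough illustration of the idea, written in R like the chess examples in this document, credentials can be read from the environment at run time instead of being stored in a source file. The helper name get_credential_sketch and the variable name DEMO_USERNAME are hypothetical; only Sys.getenv and Sys.setenv are standard R functions.

```r
# Hypothetical helper: read a required credential from an environment variable.
get_credential_sketch <- function(name) {
  value <- Sys.getenv(name, unset = NA)
  # Fail loudly rather than continue with a missing or empty credential
  if (is.na(value) || !nzchar(value)) {
    stop("Environment variable '", name, "' is not set")
  }
  value
}

# For demonstration only: in practice the variable is set outside the code,
# for example in the shell or the deployment service's settings.
Sys.setenv(DEMO_USERNAME = "example_user")
Username <- get_credential_sketch("DEMO_USERNAME")
```

The point of the design is that nothing secret appears in a file that gets checked in to source control.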
The pathPrior() function is named after the term a priori. For a given input piece and square of the chessboard, it returns the path the piece can take on an empty chessboard, prior to the circumstances that arise when a game is in play.
The pawn is the only piece unable to move backwards, and its direction depends on its color.
pathPrior(piece = "black pawn", square = "e5")
## [1] "e4" "d4" "f4"
pathPrior(piece = "white pawn", square = "e5")
## [1] "e6" "d6" "f6"
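The pawn case can be sketched in a few lines. This is an illustration under stated assumptions, not the package's pathPrior(): the name pawn_path_sketch is hypothetical, it takes the color as a separate argument, and it covers only the one-step advance and the two diagonal squares shown in the output above (it ignores the two-square first move).

```r
# Hypothetical sketch of the a priori pawn path; not the package's pathPrior().
pawn_path_sketch <- function(color, square) {
  file <- match(substr(square, 1, 1), letters[1:8])  # a..h -> 1..8
  rank <- as.integer(substr(square, 2, 2))
  # White pawns move up the board, black pawns move down
  step <- if (color == "white") 1L else -1L
  new_rank <- rank + step
  if (new_rank < 1 || new_rank > 8) return(character(0))
  # Straight ahead first, then the two diagonals, dropping off-board files
  files <- c(file, file - 1L, file + 1L)
  files <- files[files >= 1 & files <= 8]
  paste0(letters[files], new_rank)
}

pawn_path_sketch("black", "e5")
pawn_path_sketch("white", "e5")
```

The only thing that changes between the two calls is the sign of the rank step, which is why color matters for this piece.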
The pawn is the only piece whose color needs to be specified for pathPrior() to work effectively. The Post.() functions are, conversely, named after the term a posteriori. They trim each piece’s options according to the circumstances of the entire board.
They cannot be used unless a snapshot of the chessboard’s position exists for them to analyze. The snapshots exist as rows of the position data frame, which has one variable for each square of the chessboard:
##            c6         g7         d7   g3         h8   a6         d2   g6
## 000_zero <NA> black pawn black pawn <NA> black Rook <NA> white pawn <NA>
##                  h7   h3
## 000_zero black pawn <NA>
The first row is the empty chessboard, with no pieces listed for any of the squares. The second row is the zero position, with pieces set at starting positions.
We can compare the mobility of a white knight on b1 before and after the pieces are set up:
pathPrior(piece = "knight", square = "b1")
## [1] "c3" "a3" "d2"
pathPost.(square = "b1", game_pgn = "000_zero")
## [1] "c3" "a3"
Because of the game_pgn input, pathPost. knows that the piece on b1 is a white knight, and that it cannot move to d2 because that square is occupied by a white pawn.
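The trimming step for the knight can be sketched as follows. This is a simplified illustration, not the package's pathPost.: the name trim_own_pieces_sketch is hypothetical, it assumes each occupied square holds a string beginning with the piece's color (as in "white pawn"), and it only removes squares held by the mover's own pieces. Sliding pieces would also need blocking logic, which is left out here.

```r
# Hypothetical sketch: drop squares already occupied by the mover's own color.
trim_own_pieces_sketch <- function(path, snapshot, color) {
  # Look up the occupant of each candidate square in the one-row snapshot
  occupants <- vapply(path, function(sq) as.character(snapshot[[sq]][1]),
                      character(1))
  own <- !is.na(occupants) & startsWith(occupants, color)
  path[!own]
}

# A three-square excerpt of the starting position, enough for the b1 knight
snapshot <- data.frame(c3 = NA_character_, a3 = NA_character_,
                       d2 = "white pawn", stringsAsFactors = FALSE)
trim_own_pieces_sketch(c("c3", "a3", "d2"), snapshot, "white")
```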
Let us begin by observing the game I am currently playing. We can modify the link into the format “http://username:password@chess.com/…” and pass it to the rawToTidy() function as follows:
LinkID <- 131764454
Username <- "thinkboolean"
Password <- "blogChess" # counting on you not to abuse this; feel free to contact me for details
# paste0() joins the pieces without the spaces that paste() inserts by default
Link <- paste0("http://", Username, ":", Password, "@chess.com/echess/game?id=", LinkID)
print(Link)
The first move is pawn to e4. Inputting it into the newPosition() function replicates the last (or specified) row of the position frame, places the white pawn on e4, and removes it from e2.
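The replicate-then-move behavior described above could look roughly like this. It is a sketch under assumptions, not the real newPosition(): the name add_move_sketch, its arguments, and the row name "game1_e4" are hypothetical, and the board is cut down to two squares to keep the example short.

```r
# Hypothetical sketch of appending a new snapshot; not the package's newPosition().
add_move_sketch <- function(position, from, to, rowName) {
  # Replicate the last row of the position frame
  new_row <- position[nrow(position), , drop = FALSE]
  # Place the piece on the destination square and clear the origin square
  new_row[[to]] <- new_row[[from]]
  new_row[[from]] <- NA_character_
  rownames(new_row) <- rowName
  rbind(position, new_row)
}

position <- data.frame(e2 = "white pawn", e4 = NA_character_,
                       row.names = "game1_zero", stringsAsFactors = FALSE)
position <- add_move_sketch(position, from = "e2", to = "e4", rowName = "game1_e4")
position
```

Because each move appends a row instead of overwriting one, every earlier snapshot stays available for the a posteriori functions to analyze.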
I’m eager to have my kids help with building DataDoodler. I believe it is a great opportunity for them to learn valuable skills and be involved in building a tool that will help them in their schooling years and beyond. I needed some sort of matrix to guide me in what to teach them. The embedded chart below describes the skill path required to create the various projects in the DataDoodler platform. I am starting my kids in the 100-level courses. Of course, my oldest child, Josh, has a different contribution to make in the area of data science / R programming.