Welcome!

We launched Wrangler just a few days ago, and are excited to have already received so many inquiries. Thanks to everyone who has shared their comments and feature requests! We'd like to briefly share some of our thoughts and respond to a few of the most common questions.

My data is BIG, can I use Wrangler? And can I export my script?
For our alpha launch, we are releasing Wrangler as a client-side JavaScript application. Because everything runs in the browser, this places heavy limits on the size of the input data. In the future we plan to introduce a backend component that communicates with the front-end to greatly improve scalability.
In the meantime, we will soon augment the current Wrangler UI to enable export of data transformation scripts (for example, as Python or Hadoop scripts). So one strategy will be to paste a sub-sample of your data into Wrangler, and then use the resulting script on your full data set.
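As a rough illustration of the sub-sampling step, here is a minimal Python sketch that draws a uniform random sample of lines from a file too large to paste in full. It is not part of Wrangler: the function name sample_lines, the file name in the usage comment, and the assumption that your data is line-oriented with an optional header row are all ours, for illustration only. It uses reservoir sampling, so the file is read once and only the sample is held in memory.

```python
import random
from typing import Iterable, List, Optional, Tuple


def sample_lines(lines: Iterable[str], k: int,
                 keep_header: bool = True,
                 seed: Optional[int] = None) -> List[str]:
    """Return up to k lines chosen uniformly at random (hypothetical helper).

    Reservoir sampling (Algorithm R): one pass, memory proportional to k.
    The sampled lines are returned in their original order, with the first
    line kept as a header when keep_header is True.
    """
    rng = random.Random(seed)
    it = iter(lines)

    header: List[str] = []
    if keep_header:
        first = next(it, None)
        if first is None:
            return []  # empty input
        header = [first]

    reservoir: List[Tuple[int, str]] = []  # (original position, line)
    for i, line in enumerate(it):
        if len(reservoir) < k:
            reservoir.append((i, line))
        else:
            # Replace a random slot with probability k / (i + 1).
            j = rng.randint(0, i)
            if j < k:
                reservoir[j] = (i, line)

    reservoir.sort()  # restore original row order
    return header + [line for _, line in reservoir]


# Example usage (the file name is a placeholder):
#
#     with open("my_big_file.csv") as f:
#         print("".join(sample_lines(f, k=500)), end="")
```

The printed sample can then be pasted into Wrangler. A uniform sample may miss rare formatting quirks in the full data, so it is worth spot-checking any script derived from it against rows that were not sampled.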

What are your plans for releasing the software?
At this initial stage we are simply making the web application available. By doing so, we hope to improve the tool with user feedback prior to our next release. We intend to eventually release the system as open-source software.

My data is private, can I use Wrangler?
During this initial experimental phase, we're interested in learning how Wrangler is being used, and we're hoping you will be willing to help. To that end we are logging transformation steps and their initiating interactions (column header clicks, text selection ranges). We neither transmit nor store your full pasted data set. However, data elements referenced within transformation steps (column names and selected ranges of text) are included in our log. In the future we plan to release the software so you can run it standalone on your own machine.

Can I learn more about data cleaning? Are there other tools to explore?
Yes! Data Wrangler builds on ideas published over ten years ago in the prescient Potter's Wheel system, as well as great work in the area of "programming by demonstration". And Wrangler is hardly the only kid on the block – you might also be interested in David Huynh and Co.'s work on Google Refine.

What else is coming down the pipeline?
We have some research surprises cooking in the laboratory. For example, we are working on more advanced visualizations to help you further explore and clean the data once it has been "wrangled" into a tabular format.

Come back to this page from time to time for the latest updates!

Thanks!
The Wrangler Team (Sean, Ravi, Phil, Andreas, Joe, and Jeff)
