Schematic of the cleaning and analysis pipeline for high-throughput Ig gene sequences. The process starts with raw reads from high-throughput sequencing (right column). Each rectangle in the right column represents an independent module or program used in the analysis pipeline, which can therefore be skipped. The Automation program appears in the right column as a single step; the lines leaving the “Automation” rectangle lead to the left column, which shows the internal flow of the Automation program, composed of several steps as detailed in the manuscript. The Automation program takes a file of sequences as input and creates metadata and alignment files for each clone for further analysis. The arrows in the right column represent the flow of cleaning and analysis recommended and followed by the authors. The arrows in the left column represent the flow of the Automation program, as discussed in the manuscript.