This document summarizes a group project to parallelize the solving of Nonogram puzzles. The group explored several approaches to parallelization, including restructuring the code to exploit SIMD instructions, improving memory access patterns, and implementing several MPI versions for distributed processing. Evaluation showed that the MPI version with dynamic scheduling and asynchronous communication achieved the best speedup, roughly 5x on 10 cores. Further work on load balancing and data dependencies could yield additional gains, but the problem's inherently sequential structure limits how much parallelism can be extracted.
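To make the winning configuration concrete, below is a minimal sketch of MPI dynamic scheduling with asynchronous communication in a master-worker layout. It is not the project's actual code: `solve_puzzle`, `NUM_PUZZLES`, and the tag names are hypothetical placeholders, and the overlap of the completion report (`MPI_Isend`) with the wait for the next task is one plausible way such a scheme could be structured.

```c
/* Sketch only: master hands out puzzle indices on demand (dynamic scheduling);
 * workers report results with a non-blocking send so the report overlaps
 * with waiting for the next task. solve_puzzle() is a placeholder. */
#include <mpi.h>

#define NUM_PUZZLES 64   /* hypothetical task count */
#define TAG_WORK 1
#define TAG_DONE 2

static void solve_puzzle(int id) { (void)id; /* placeholder for the solver */ }

int main(int argc, char **argv) {
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    if (rank == 0) {                          /* master */
        int next = 0, active = size - 1, done;
        MPI_Status st;
        for (int w = 1; w < size; ++w) {      /* seed every worker once */
            if (next < NUM_PUZZLES) {
                MPI_Send(&next, 1, MPI_INT, w, TAG_WORK, MPI_COMM_WORLD);
                ++next;
            } else {
                int stop = -1;                /* nothing to do: shut it down */
                MPI_Send(&stop, 1, MPI_INT, w, TAG_WORK, MPI_COMM_WORLD);
                --active;
            }
        }
        while (active > 0) {                  /* refill whichever worker finishes */
            MPI_Recv(&done, 1, MPI_INT, MPI_ANY_SOURCE, TAG_DONE,
                     MPI_COMM_WORLD, &st);
            if (next < NUM_PUZZLES) {
                MPI_Send(&next, 1, MPI_INT, st.MPI_SOURCE, TAG_WORK, MPI_COMM_WORLD);
                ++next;
            } else {
                int stop = -1;
                MPI_Send(&stop, 1, MPI_INT, st.MPI_SOURCE, TAG_WORK, MPI_COMM_WORLD);
                --active;
            }
        }
    } else {                                  /* worker */
        int task, done;
        MPI_Request req;
        MPI_Recv(&task, 1, MPI_INT, 0, TAG_WORK, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        while (task >= 0) {
            solve_puzzle(task);
            done = task;
            /* Non-blocking report: overlap it with waiting for the next task. */
            MPI_Isend(&done, 1, MPI_INT, 0, TAG_DONE, MPI_COMM_WORLD, &req);
            MPI_Recv(&task, 1, MPI_INT, 0, TAG_WORK, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            MPI_Wait(&req, MPI_STATUS_IGNORE);
        }
    }
    MPI_Finalize();
    return 0;
}
```

The dynamic element is that the master assigns the next puzzle to whichever worker reports back first, which is what helps when individual puzzles take very different amounts of time to solve.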