Machine learning has proved to be a powerful tool for artificial intelligence problems. However, its success usually depends on a good feature representation of the data, and a poor representation can severely limit the performance of learning algorithms. Such representations are often hand-designed, which requires significant domain knowledge and human labor. To address these issues, there has been much interest in algorithms that learn feature hierarchies from unlabeled and labeled data. In this talk, I will discuss the fundamental challenges and present my research on algorithms that learn invariant, task-relevant representations from such data, with applications to visual and speech recognition tasks.