The combination of Deep Learning and Big Data has revolutionized language and speech technology over the last five years, and now constitutes the state of the art in domains ranging from machine translation and question answering to speech recognition and music generation. These models are often so accurate that many new and useful applications become possible, with potentially significant impact on individuals, businesses, and society. With that power and popularity come new responsibilities and questions: how do we ensure reliability, avoid undesirable biases, and provide insight into how a system arrives at a particular outcome? How do we leverage domain expertise and user feedback to improve the models further? In addressing all of these issues, the “interpretability” of deep learning models is key.