In addition to code and language reasoning, recent work shows that large language models (LLMs) can achieve promising results on causal reasoning tasks, e.g., in determining cause-effect relationships. This points to a new paradigm for addressing real-world causal tasks by integrating LLMs into the causal inference workflow, but it also raises fundamental questions about the causal reasoning capabilities of LLMs. Given that LLMs are trained on observational data, do their capabilities reflect genuine causal reasoning or are they simply a result of dataset memorization, and how reliable are their reasoning outputs? If they do not reflect genuine causal reasoning, how can we use causal techniques to improve LLMs' reasoning capabilities? To address these questions, this Dagstuhl Seminar aims to bring together experts from causal machine learning and language models to foster collaboration and identify the key research questions at the intersection of the two fields.