Complex intrinsic and extrinsic forces from the body and environment push the brain into non-equilibrium. The arrow of time, central to thermodynamics in physics, is a hallmark of non-equilibrium and serves to distinguish between reversible and non-reversible dynamics in any system. Here, we use a deep learning Temporal Evolution NETwork (TENET) framework to discover the asymmetry in the flow of events, the ‘arrow of time’, in human brain signals, which quantifies how the brain is driven by the interplay of the environment and internal processes. Specifically, we show in large-scale HCP neuroimaging data from a thousand participants that the level of non-reversibility/non-equilibrium changes across time and cognitive state, with higher levels during tasks than at rest. The level of non-equilibrium also differentiates brain activity across the seven different cognitive tasks. Furthermore, using the large-scale UCLA neuroimaging dataset of 265 participants, we show that the TENET framework can distinguish, with high specificity and sensitivity, resting-state activity in controls from that in different neuropsychiatric disorders (schizophrenia, bipolar disorder and ADHD), with higher levels of non-equilibrium found in health. Overall, the present thermodynamics-based machine learning framework provides vital new insights into the fundamental tenets of brain dynamics for orchestrating the interactions between behaviour and brain in complex environments.
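To make the idea of detecting an arrow of time concrete, the sketch below shows one way such a problem could be framed as supervised learning: windows of a multi-region signal are paired with their time-reversed copies and labelled accordingly. This is only an illustration under stated assumptions. The abstract does not specify TENET's architecture, inputs or training procedure, and the function name `make_arrow_of_time_dataset`, the non-overlapping windowing and the label convention (1 for forward, 0 for reversed) are hypothetical choices made here.

```python
import numpy as np


def make_arrow_of_time_dataset(signals, window):
    """Build forward / time-reversed training examples (illustrative only).

    signals: array of shape (n_regions, n_timepoints).
    window:  number of timepoints per example (hypothetical parameter).

    Returns X with shape (2 * n_windows, n_regions, window) and labels y,
    where 1 marks a forward window and 0 marks its time-reversed copy.
    """
    signals = np.asarray(signals, dtype=float)
    n_regions, n_time = signals.shape
    n_windows = n_time // window

    # Drop trailing samples that do not fill a whole window, then cut the
    # series into non-overlapping windows: (n_windows, n_regions, window).
    trimmed = signals[:, : n_windows * window]
    forward = trimmed.reshape(n_regions, n_windows, window).transpose(1, 0, 2)

    # Time reversal flips the last (time) axis of every window.
    backward = forward[:, :, ::-1]

    X = np.concatenate([forward, backward], axis=0)
    y = np.concatenate(
        [np.ones(n_windows, dtype=int), np.zeros(n_windows, dtype=int)]
    )
    return X, y
```

Under this framing, a classifier trained on `X` and `y` that cannot beat chance suggests the windows are statistically reversible, whereas accuracy well above chance indicates a detectable arrow of time. How the classifier's output is turned into a level of non-reversibility is not described in the abstract and is left open here.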