[1] P. V. Coveney, E. R. Dougherty, and R. R. Highfield, Big data need big theory too, Phil. Trans. R. Soc. A 374, 20160153 (2016).
[2] G. Carleo, I. Cirac, K. Cranmer, L. Daudet, M. Schuld, N. Tishby, L. Vogt-Maranto, and L. Zdeborová, Machine learning and the physical sciences, Rev. Mod. Phys. 91, 045002 (2019).
[3] V. Dunjko and H. J. Briegel, Machine learning & artificial intelligence in the quantum domain: A review of recent progress, Rep. Prog. Phys. 81, 074001 (2018).
[4] S. Das Sarma, D.-L. Deng, and L.-M. Duan, Machine learning meets quantum physics, Phys. Today 72, No. 3, 48 (2019).
[5] W. J. Murdoch, C. Singh, K. Kumbier, R. Abbasi-Asl, and B. Yu, Definitions, methods, and applications in interpretable machine learning, Proc. Natl. Acad. Sci. U.S.A. 116, 22071 (2019).
[6] J. Preskill, Quantum information and physics: Some future directions, J. Mod. Opt. 47, 127 (2000).
[7] D. J. C. MacKay, Information Theory, Inference, and Learning Algorithms (Cambridge University Press, Cambridge, England, 2002).
[8] M. Mézard and A. Montanari, Information, Physics, and Computation (Oxford University Press, New York, 2009).
[9] C. E. Shannon, A mathematical theory of communication, Bell Syst. Tech. J. 27, 379 (1948).
[10] Y. Blau and T. Michaeli, Rethinking lossy compression: The rate-distortion-perception tradeoff, in Proceedings of the 36th International Conference on Machine Learning, ICML 2019, Proceedings of Machine Learning Research Vol. 97 (PMLR, 2019), pp. 675–685.
[11] N. Tishby, F. C. Pereira, and W. Bialek, The information bottleneck method, in Proceedings of the 37th Allerton Conference on Communication, Control and Computation (Univ. of Illinois, Illinois, 2001), Vol. 49.
[12] A. A. Alemi, I. Fischer, J. V. Dillon, and K. Murphy, Deep variational information bottleneck, arXiv:1612.00410.
[13] K. G. Wilson and J. Kogut, The renormalization group and the ε expansion, Phys. Rep. 12, 75 (1974).
[14] K. G. Wilson, The renormalization group: Critical phenomena and the Kondo problem, Rev. Mod. Phys. 47, 773 (1975).
[15] M. E. Fisher, Renormalization group theory: Its basis and formulation in statistical physics, Rev. Mod. Phys. 70, 653 (1998).
[16] A. A. Belavin, A. M. Polyakov, and A. B. Zamolodchikov, Infinite conformal symmetry of critical fluctuations in two dimensions, J. Stat. Phys. 34, 763 (1984).
[17] A. A. Belavin, A. M. Polyakov, and A. B. Zamolodchikov, Infinite conformal symmetry in two-dimensional quantum field theory, Nucl. Phys. B241, 333 (1984).
[18] D. Friedan, Z. Qiu, and S. Shenker, Conformal Invariance, Unitarity, and Critical Exponents in Two Dimensions, Phys. Rev. Lett. 52, 1575 (1984).
[19] P. Di Francesco, P. Mathieu, and D. Sénéchal, Conformal Field Theory, Graduate Texts in Contemporary Physics (Springer, New York, 1997).
[20] J. L. Cardy, Scaling and Renormalization in Statistical Physics, Cambridge Lecture Notes in Physics (Cambridge University Press, Cambridge, 1996).
[21] C. Itzykson, H. Saleur, and J.-B. Zuber, Conformal Invariance and Applications to Statistical Mechanics (World Scientific, Singapore, 1998).
[22] D. Poland, S. Rychkov, and A. Vichi, The conformal bootstrap: Theory, numerical techniques, and applications, Rev. Mod. Phys. 91, 015002 (2019).
[23] A. B. Zamolodchikov, Irreversibility of the flux of the renormalization group in a 2D field theory, JETP Lett. 43, 730 (1986).
[24] J. Gaite and D. O'Connor, Field theory entropy, the H theorem, and the renormalization group, Phys. Rev. D 54, 5163 (1996).
[25] H. Casini and M. Huerta, A c-theorem for entanglement entropy, J. Phys. A 40, 7031 (2007).
[26] S. M. Apenko, Information theory and renormalization group flows, Physica (Amsterdam) 391A, 62 (2012).
[27] B. B. Machta, R. Chachra, M. K. Transtrum, and J. P. Sethna, Parameter space compression underlies emergent theories and predictive models, Science 342, 604 (2013).
[28] V. Balasubramanian, J. J. Heckman, and A. Maloney, Relative entropy and proximity of quantum field theories, J. High Energy Phys. 05 (2015) 104.
[29] C. Bény and T. J. Osborne, The renormalization group via statistical inference, New J. Phys. 17, 083005 (2015).
[30] C. Bény and T. J. Osborne, Information-geometric approach to the renormalization group, Phys. Rev. A 92, 022330 (2015).
[31] C. Bény, Coarse-grained distinguishability of field interactions, Quantum 2, 67 (2018).
[32] M. I. Belghazi, A. Baratin, S. Rajeshwar, S. Ozair, Y. Bengio, A. Courville, and D. Hjelm, Mutual information neural estimation (PMLR, 2018), pp. 531–540, arXiv:1801.04062.
[33] B. Poole, S. Ozair, A. Van Den Oord, A. Alemi, and G. Tucker, On variational bounds of mutual information (PMLR, 2019), pp. 5171–5180, arXiv:1905.06922.
[34] D. E. Gökmen, Z. Ringel, S. D. Huber, and M. Koch-Janusz, Statistical physics through the lens of real-space mutual information, arXiv:2101.11633.
[35] D. E. Gökmen, Z. Ringel, S. D. Huber, and M. Koch-Janusz, Phase diagrams with real-space mutual information neural estimation, arXiv:2103.16887.
[36] M. Koch-Janusz and Z. Ringel, Mutual information, neural networks and the renormalization group, Nat. Phys. 14, 578 (2018).
[37] P. M. Lenggenhager, D. E. Gökmen, Z. Ringel, S. D. Huber, and M. Koch-Janusz, Optimal renormalization group transformation from information theory, Phys. Rev. X 10, 011037 (2020).
[38] S. Hassanpour, D. Wuebben, and A. Dekorsy, Overview and investigation of algorithms for the information bottleneck method, in Proceedings of the 11th International ITG Conference on Systems, Communications and Coding, SCC 2017 (VDE Verlag GmbH, Berlin, 2017), pp. 1–6.
[39] A. E. Parker, T. Gedeon, and A. G. Dimitrov, Annealing and the rate distortion problem, in Proceedings of the 15th International Conference on Neural Information Processing Systems, NIPS'02 (MIT Press, Cambridge, MA, USA, 2002), pp. 976–993.
[40] T. Gedeon, A. E. Parker, and A. G. Dimitrov, The mathematical structure of information bottleneck methods, Entropy 14, 456 (2012).
[41] G. Chechik, A. Globerson, N. Tishby, and Y. Weiss, Information bottleneck for Gaussian variables, in Advances in Neural Information Processing Systems, edited by S. Thrun, L. K. Saul, and B. Schölkopf (MIT Press, Cambridge, 2004), Vol. 16, pp. 1213–1220.
[42] H. A. Kramers and G. H. Wannier, Statistics of the two-dimensional ferromagnet. Part I, Phys. Rev. 60, 252 (1941).
[43] L. Onsager, Crystal statistics. I. A two-dimensional model with an order-disorder transition, Phys. Rev. 65, 117 (1944).
[44] P. Nightingale, Finite-size scaling and phenomenological renormalization, J. Appl. Phys. 53, 7927 (1982).
[45] B. Derrida and L. De Seze, Application of the phenomenological renormalization to percolation and lattice animals in dimension 2, J. Phys. 43, 475 (1982).
[46] J. L. Cardy, Conformal invariance and universality in finite-size scaling, J. Phys. A 17, L385 (1984).
[47] J. L. Cardy, Operator content of two-dimensional conformally invariant theories, Nucl. Phys. B270, 186 (1986).
[48] See the Supplemental Material at http://link.aps.org/supplemental/10.1103/PhysRevLett.126.240601 for further discussion and details, which includes Refs. [49–58].
[49] E. Schneidman, N. Slonim, N. Tishby, R. R. de Ruyter van Steveninck, and W. Bialek, Analyzing neural codes using the information bottleneck method (2001), http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.29.3263.
[50] F. Creutzig and H. Sprekeler, Predictive coding and the slowness principle: An information-theoretic approach, Neural Comput. 20, 1026 (2008).
[51] L. Buesing and W. Maass, A spiking neuron as information bottleneck, Neural Comput. 22, 1961 (2010).
[52] N. Slonim and N. Tishby, Document clustering using word clusters via the information bottleneck method, in Proceedings of the 23rd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR '00 (Association for Computing Machinery, New York, 2000), pp. 208–215.
[53] S. Still, W. Bialek, and L. Bottou, Geometric clustering using the information bottleneck method, in Advances in Neural Information Processing Systems, edited by S. Thrun, L. Saul, and B. Schölkopf (MIT Press, Cambridge, 2004), Vol. 16, pp. 1165–1172.
[54] D. J. Strouse and D. J. Schwab, The information bottleneck and geometric clustering, Neural Comput. 31, 596 (2019).
[55] F. Creutzig, A. Globerson, and N. Tishby, Past-future information bottleneck in dynamical systems, Phys. Rev. E 79, 041925 (2009).
[56] S. Still, Information bottleneck approach to predictive inference, Entropy 16, 968 (2014).
[57] S. Agmon, E. Benger, O. Ordentlich, and N. Tishby, Critical slowing down near topological transitions in rate-distortion problems, arXiv:2103.02646.
[58] N. Slonim, The information bottleneck: Theory and applications, Ph.D. thesis, The Hebrew University of Jerusalem, 2002.
[59] A. Banerjee and Z. Ringel, Information bottleneck and Gaussian field theory (to be published).
[60] J. Zinn-Justin, Quantum Field Theory and Critical Phenomena, International Series of Monographs on Physics (Clarendon Press, Oxford, 1989).
[61] C. N. Yang and S. C. Zhang, SO(4) symmetry in a Hubbard model, Mod. Phys. Lett. B 04, 759 (1990).
[62] T. Senthil, A. Vishwanath, L. Balents, S. Sachdev, and M. P. A. Fisher, Deconfined quantum critical points, Science 303, 1490 (2004).
[63] R. Bondesan and A. Lamacraft, Learning symmetries of classical integrable systems, arXiv:1906.04645.
[64] A. Nir, E. Sela, R. Beck, and Y. Bar-Sinai, Machine-learning iterative calculation of entropy for physical systems, Proc. Natl. Acad. Sci. U.S.A. 117, 30234 (2020).