Recommendation

A model petting zoo for interacting with network structure

Leto Peel based on reviews by 2 anonymous reviewers

A recommendation of:

Structify-Net: Random Graph generation with controlled size and customized structure

Remy Cazabet, Salvatore Citraro, Giulio Rossetti (2023), arXiv, ver.2, peer-reviewed and recommended by PCI Network Science https://doi.org/10.48550/arXiv.2306.05274

Now published in Peer Community Journal

Abstract

Structify-Net: Random Graph generation with controlled size and customized structure

Network structure is often considered one of the most important features of a network, and various models exist to generate graphs having one of the most studied types of structures, such as blocks/communities or spatial structures. In this article, we introduce a framework for the generation of random graphs with a controlled size -- number of nodes, edges -- and a customizable structure, beyond blocks and spatial ones, based on node-pair rank and a tunable probability function allowing to control the amount of randomness. We introduce a structure zoo -- a collection of original network structures -- and conduct experiments on the small-world properties of networks generated by those structures. Finally, we introduce an implementation as a Python library named Structify-net.

Network Generation, Random Graphs, Network Structure, Python Library

Submission: posted 09 June 2023, validated 09 June 2023
Recommendation: posted 20 September 2023, validated 20 September 2023

Cite this recommendation as:
Peel, L. (2023) A model petting zoo for interacting with network structure. Peer Community in Network Science, 100114. https://doi.org/10.24072/pci.networksci.100114

Recommendation

If you work, study or play in network science, then chances are you have generated a network. Whether or not you have a real-world system to analyse, synthetic networks play an important role in network science. Generating networks of a chosen size can provide a null model for a statistical test, a test bed for new algorithms or the basis for studying the interplay between structure and dynamics in complex systems. Consequently, the network science literature contains a wide array of network models: some designed as processes to replicate observed properties and others for the purposes of statistical inference. However, these models have different parameters and constraints, may or may not be able to control for random noise, and do not always have readily available software implementations, which puts them out of reach of many network science practitioners.

The article of Cazabet et al. (2023) introduces a software "zoo," called Structify-Net, that contains a range of models that the authors have captured from the wild. The authors have focused on developing a framework that enables the generation of networks of a chosen size, specified by the number of nodes and edges, and provides the means to control for randomness by interpolating between the specified structure and a random graph. The article also discusses an interesting use case, examining the interplay between network structure and node attributes, which might complement methods based on permutation tests (Bianconi et al. 2009, Ehrhardt and Wolfe 2019).

Structify-Net presents some interesting future opportunities. For instance, the independence that Structify-Net imposes between the edge ranking (defined by the model) and the expected number of edges (defined by the user) might offer a route towards exploring network growth or evolution. Like any zoo, Structify-Net is not complete: there are many more exotic "species" that the authors, or perhaps others in the network science community, may later collect. Collecting more model implementations to align with reviews of network models (Goldenberg et al. 2010), together with methods of statistical inference, has the potential to lay the foundations for the ever-important bridge between theory and practice in network science (Peel et al. 2022).

References

Bianconi, Ginestra, Paolo Pin, and Matteo Marsili (2009) Assessing the Relevance of Node Features for Network Structure. Proceedings of the National Academy of Sciences 106, 28: 11433–38. https://doi.org/10.1073/pnas.0811511106

Cazabet, Remy, Salvatore Citraro, and Giulio Rossetti (2023) Structify-Net: Random Graph Generation with Controlled Size and Customized Structure. arXiv, ver. 2 peer-reviewed and recommended by Peer Community in Network Science. https://doi.org/10.48550/arXiv.2306.05274

Ehrhardt, Beate, and Patrick J. Wolfe (2019) Network Modularity in the Presence of Covariates. SIAM Review 61, 2: 261–76. https://doi.org/10.1137/17M1111528

Goldenberg, Anna, Alice X. Zheng, Stephen E. Fienberg, and Edoardo M. Airoldi (2010) A Survey of Statistical Network Models. Foundations and Trends in Machine Learning 2, 2: 129–233. https://doi.org/10.1561/2200000005

Peel, Leto, Tiago P. Peixoto, and Manlio De Domenico (2022) Statistical Inference Links Data and Theory in Network Science. Nature Communications 13, 1: 6794. https://doi.org/10.1038/s41467-022-34267-9

Conflict of interest:
The recommender in charge of the evaluation of the article and the reviewers declared that they have no conflict of interest (as defined in the code of conduct of PCI) with the authors or with the content of the article. The authors declared that they comply with the PCI rule of having no financial conflicts of interest in relation to the content of the article.

Funding:
This work is supported by the European Union – Horizon 2020 Program under the scheme “INFRAIA-01-2018-2019 – Integrating Activities for Advanced Communities”, Grant Agreement n.871042, “SoBigData++: European Integrated Infrastructure for Social Mining and Big Data Analytics” (http://www.sobigdata.eu). This project was partly funded by BITUNAM grant ANR-18-CE23-0004.

Reviews

Reviewed by anonymous reviewer 1, 06 Sep 2023

I think the authors have thoughtfully addressed my comments, and I don't have any other revisions I would recommend before publication.

https://doi.org/10.24072/pci.networksci.100114.rev21

Reviewed by anonymous reviewer 2, 06 Sep 2023

I thank the authors for their careful consideration of my previous comments and their work on improving the manuscript. I find the purpose of the paper much clearer and the link to other works better explained. I believe the paper could be published as it is, although I still have two minor suggestions.

1. I would personally prefer putting the description of related works before the introduction of the new framework. As a reader, I always find it easier to know what has been done before to better understand the value of something new.

2. The authors now explain well how their package is different from other network generator packages. Maybe it would be good to also mention why the package is useful for researchers who are thinking of writing their own code directly (how would the functions built into the package make their life easier? Are there benefits to having everyone use the same software? etc.)

I wish the authors good luck with this work!

https://doi.org/10.24072/pci.networksci.100114.rev22

Evaluation round #1

DOI or URL of the preprint: https://arxiv.org/abs/2306.05274

Version of the preprint: 1

Author's Reply, 06 Sep 2023

Download author's reply https://doi.org/10.24072/pci.networksci.100114.ar1

Decision by Leto Peel, posted 29 Jun 2023, validated 30 Jun 2023

I think this preprint shows potential and would like to recommend a revised version of it. The current version falls a bit short. The reviewers make some very nice suggestions to this end. In particular, I'd like to emphasize the point about connecting better with previous literature. It seems there are a number of models that could be considered as part of the current version (e.g., see Peter Hoff's work on latent space models), so not pointing this out seems like a missed opportunity. Providing a summary of features/benefits over other packages that involve network generation would also help promote the proposed software.

https://doi.org/10.24072/pci.networksci.100114.d1

Reviewed by anonymous reviewer 1, 16 Jun 2023

In this manuscript the authors develop a method for generating networks with a given number of nodes (n) and number of edges (m), which have planted structure according to a ranking of the node pairs and a non-decreasing probability function mapping these ranks to edge existence probabilities. The authors construct a parameterization for their probability function which allows one to interpolate between the extremes of a completely random graph (equal probability for all edge pairs, epsilon = 1) and a completely deterministic graph given the rankings (probability 1 for the m highest ranking node pairs and 0 elsewhere, epsilon = 0). They then construct a collection ("structure zoo") of different ranking matrices that can be used along with their tunable probability function to generate networks with a wide variety of planted structures and any desired level of noise. They conduct experiments to examine the effect of changing epsilon on the small-worldness of the generated graphs, using different rank matrices from the structure zoo, in a manner reminiscent of the original small-world experiment of Watts and Strogatz. The authors package their method and structure zoo into an easy-to-use Python package and make their code for the article available as well.
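The generative process summarised above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the Structify-Net implementation or its API: the function names (ring_score, edge_probabilities, generate_graph) are hypothetical, and the probability function is a plain linear blend between the two extremes, swapped in for the rational Bézier parameterization the paper uses. The blend keeps the two properties described in the summary: the expected number of edges equals m for every epsilon, and epsilon moves the graph from fully determined by the ranking (epsilon = 0) to fully random (epsilon = 1).

```python
import itertools
import random


def ring_score(n):
    """Hypothetical rank function: node pairs that are closer on a ring score higher."""
    def score(pair):
        u, v = pair
        d = abs(u - v)
        return -min(d, n - d)
    return score


def edge_probabilities(n, m, rank_score, epsilon):
    """Assign each node pair an edge probability; the probabilities sum to m."""
    if not 0.0 <= epsilon <= 1.0:
        raise ValueError("epsilon must lie in [0, 1]")
    pairs = list(itertools.combinations(range(n), 2))
    if not 0 <= m <= len(pairs):
        raise ValueError("m must lie between 0 and n * (n - 1) / 2")
    # Best-ranked pairs first: these are the pairs the planted structure prefers.
    ordered = sorted(pairs, key=rank_score, reverse=True)
    uniform = m / len(pairs)  # probability of every pair in the fully random case
    probs = {}
    for position, pair in enumerate(ordered):
        deterministic = 1.0 if position < m else 0.0  # step function at rank m
        # Assumption: linear blend, NOT the paper's rational Bezier curve.
        probs[pair] = epsilon * uniform + (1.0 - epsilon) * deterministic
    return probs


def generate_graph(n, m, rank_score, epsilon, seed=None):
    """Draw each node pair independently and return the sampled edge list."""
    rng = random.Random(seed)
    probs = edge_probabilities(n, m, rank_score, epsilon)
    return [pair for pair, p in probs.items() if rng.random() < p]


if __name__ == "__main__":
    n, m = 30, 60
    for eps in (0.0, 0.5, 1.0):
        edges = generate_graph(n, m, ring_score(n), eps, seed=42)
        print(f"epsilon={eps}: {len(edges)} edges")
```

With epsilon = 0 the sketch returns exactly the m best-ranked pairs; for any other value the edge count fluctuates around m because each pair is drawn independently.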

For the most part, the method and experiments are clearly presented and easy to follow. However, I do think the authors could beef up their literature review and motivational examples in order to better place their method in context with previous work. I also think it would be nice if the authors could clarify how their method can be used in conjunction with real data given its limitations for statistical inference.

A few more specific points along these lines:

(1) The proposed method has many similarities to a graphon model, and I think the paper could be improved with some discussion of the graphon literature to demonstrate why the proposed method should be preferred. Perhaps one advantage of the proposed method is that it has a more easily controlled noise level?

(2) On a similar note, the graphs being generated are specific instances of inhomogeneous random graphs, so it could be worth discussing that literature further to identify any similar previous work and how the proposed method improves upon it.

(3) I think I generally understand the authors' use of the rational Bezier curve as a means of interpolating between the two extremes of complete equality and inequality in the distribution of the probabilities over the edge pairs while fixing m. But how did they decide on this family of curves for interpolation? Is it the most "natural" in some sense? Or does its derivative have a particularly nice form? I'm curious to know because I hadn't heard of these functions before reading the paper.

(4) In the motivation section, the authors highlight diffusion as a particular process for which understanding the role of network structure is important, but there are many other example applications they could mention (e.g. synchronization, percolation). By singling out diffusion it appears like the proposed method has a more limited set of applications than it does, and I think the motivation would be strengthened by adding some additional examples.

(5) The authors do a good job of pointing out the limitations of their method at the end of the paper, in particular the scalability and the independence assumption they make. I would also add here the challenges resulting from the poor compatibility of this method with parameter estimation techniques, necessitating the comparison of ad hoc summary statistics as discussed in the last paragraph.

(6) I think it would be interesting to see how the detectability of community structure varies with epsilon for block-structured rank matrices. Does this have a clean phase transition like in the standard detectability setting? Not a suggestion for publication, just a thought for future studies.

(7) A very minor point that I found confusing was that the authors said epsilon "controls how strongly the random graph is driven by the community structure." I think they should change the wording to something like "planted structure" or "rank structure", since their method can represent much more than community structure.

(8) Another very minor point for presentation is that I found the in-text citation format a bit confusing. There should be brackets or parentheses or something around the in-text citations to separate them from the other text.

I enjoyed reading this paper, and I hope the authors find my suggestions helpful.

https://doi.org/10.24072/pci.networksci.100114.rev11

Reviewed by anonymous reviewer 2, 20 Jun 2023

This article proposes a general framework for the generation of random networks along with the corresponding Python package. The main contribution of the paper is to spell out a definition of a random graph that can be further specified to reproduce well-known models (SBMs, small-world networks, etc.). While the idea of building an umbrella framework for the bestiary of random graphs we know seems good, there are several problems that should be addressed before publication.

Major comments:

- The theoretical contribution of the paper is not clear enough. The paper does not explain the intuition behind the general generative process that it proposes (the two steps on page 3) and, most importantly, what the rank function may represent or how it may be interpreted. Right now, it is not clear why one would use this framework instead of directly using specific and already-known models.

- Linked to this, it is not clear what the Structify-net package adds to the existing software tools we already have. In particular, the graph generator functions in NetworkX seem to already do a fine job in helping users to generate the graphs they are interested in. The authors should show how their tool compares to already existing solutions and why it is a useful addition.

- The application on replicating the Watts-Strogatz experiment does not seem to be the best way to demonstrate the value of the framework, because it is very specific. A better way could be to compare the results of the graph generator to several classic graph generators (either theoretically, looking at graph characteristics that are produced, such as clustering, diameter, etc., or comparing software capabilities, such as computing time).

Minor comments:

- Page 2, the section on "Motivational examples" is very short and discusses somewhat random problems. The authors may want to develop this section further to better explain the cases where random graphs are useful, so that they can refer back to these cases later in the paper (the examples in the "Structure Zoo" section go in all directions and it might be useful for the reader to have a bit of background beforehand).

- Page 2, there are many fields where the Erdős-Rényi model is definitely not the most commonly used. Maybe the most known, or elementary?

- Page 4, "P(r) = p, p in [0,1]" reads as: the probability function is a constant p.

- Page 4, what is the rationale behind using Bézier curves (accent missing)? What do the endpoints and control point represent?

- The sub-sections on block structures page 6 are straightforward and could be shortened.

- In comparison, the sections on star structure and core-periphery page 7 could be better developed. Are the formulations for the rank functions proposed the only ones possible? How were they derived?

- Why are fractal models useful? (see first minor comment)

- Page 8, "hierarchical structure" in networks has been defined in many ways and by many other papers, and hierarchy in networks is a vibrant topic in social network analysis...

- Check typos: page 2, "heterogeneityBarthélémy", page 4, "Section describes"...

https://doi.org/10.24072/pci.networksci.100114.rev12