1 Chapter 1: Planning a Mobile Phone Data Initiative
This chapter introduces the foundational considerations for planning a Mobile Phone Data (MPD) initiative, setting the stage for more detailed technical, institutional, and operational guidance in subsequent chapters. As mobile phones have become nearly ubiquitous across diverse socioeconomic contexts, the digital traces they generate offer unprecedented opportunities to inform public policy, development planning, and humanitarian action (Rowe 2022; Rowe et al. 2023). Harnessing this potential, however, requires careful planning, strong governance, and a clear understanding of both the opportunities and constraints associated with MPD.
The chapter begins by defining what is meant by MPD and outlining the principal ways in which it can be used to generate insights on population dynamics, mobility patterns, service access, and behavioural trends. It then examines the motivations for undertaking an MPD initiative, including the policy and operational gaps such initiatives can address, as well as the comparative advantages of MPD relative to more traditional data sources. Building on this foundation, the chapter identifies the key technical, legal, institutional, and organisational requirements that must be in place to ensure an MPD initiative is effective, ethical, and fit for purpose.
Recognising that MPD initiatives are inherently multi-actor endeavours, the chapter also discusses the critical stakeholders who must be engaged throughout the planning process, clarifying their respective roles, responsibilities, and incentives. It further highlights the principal risks associated with MPD initiatives, such as privacy concerns, data misuse, capacity constraints, and sustainability challenges, and outlines how these risks can be anticipated and mitigated through proactive planning. The chapter concludes by presenting approaches for assessing a country’s readiness to implement MPD initiatives and by introducing key design principles to support long-term sustainability, institutionalisation, and impact. Together, these elements provide a structured framework for decision-makers and practitioners seeking to responsibly and effectively integrate MPD into their data ecosystems.
1.1 Understanding the Basics of MPD
1.1.1 What Is MPD?
MPD refers to digital traces generated through the operation and use of mobile communication devices (Rowe and González-Leonardo 2024). These traces are created as mobile phones interact either with mobile network infrastructure or with software applications installed on the device. Across all forms, MPD has one defining characteristic: it can be used to approximate the geographic position of a device, and by extension its user, over time. This makes it particularly valuable for analysing patterns of human mobility and population dynamics (Gonzalez et al. 2008; Song et al. 2010; Blondel et al. 2015).
There are two broad categories of MPD:
- Mobile Network Operator (MNO) data, which is generated within the telecommunications network itself as part of routine service provision, such as call detail records (CDRs) and signalling data; and
- GPS-based data, which is generated by smartphones and collected by applications when users opt in.
This training manual focuses exclusively on the former, and more specifically on CDRs, because this is the data type most commonly available at national scale and most frequently used in official statistics and public policy applications (United Nations Statistics Division 2019; Ricciato et al. 2020; Salgado et al. 2021).
1.1.2 MNO data
MNO data refers to data generated and stored by the telecommunications companies that operate the infrastructure, such as cell tower networks, that powers mobile phone communications. These companies collect and store data about the operation of their systems for a variety of purposes. What data they collect and store depends on factors including the company’s internal policies, operating procedures, licensing requirements and available infrastructure (especially for storing very large datasets). We discuss here two main types of MNO data: CDRs and signalling data.
CDRs
CDRs are generated as part of the process for billing customers. They are transaction logs created by MNOs whenever a subscriber uses their network services. These services include making or receiving a voice call, sending or receiving an SMS, or using the phone to access the internet, for example when downloading emails, searching websites or engaging with social media (this type of event is called a mobile data session). CDRs are generated passively and automatically; no additional action is required from the user beyond normal phone usage. This passive generation is one of the benefits of CDRs: they do not incur the costs and delays associated with field data collection.
From a technical perspective, a standard CDR contains a limited but highly structured set of variables. These typically include a subscriber identifier, a cell tower or cell sector identifier indicating the network element that handled the communication, a timestamp marking when the event occurred, and a code describing the type of network event. Additional fields may include an identifier for the receiving party or technical routing information, depending on the operator and the use case. When third parties access CDR data, it is important to ensure that subscriber identifiers have been pseudonymised (see section 5.1.2 for details).
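To make the record structure concrete, the following is a minimal sketch of a CDR event with the four core fields described above, together with one possible way of pseudonymising the subscriber identifier using a keyed hash (HMAC-SHA256). The field names, the event type codes and the choice of a keyed hash are assumptions made for illustration only; actual schemas vary by operator, and section 5.1.2 describes pseudonymisation in detail.

```python
import hashlib
import hmac
from dataclasses import dataclass, replace
from datetime import datetime


@dataclass(frozen=True)
class CdrEvent:
    """Illustrative CDR record; field names are hypothetical."""
    subscriber_id: str   # raw or pseudonymised subscriber identifier
    cell_id: str         # cell tower or sector that handled the event
    timestamp: datetime  # when the event occurred
    event_type: str      # illustrative codes such as "voice", "sms", "data"


def pseudonymise(event: CdrEvent, secret_key: bytes) -> CdrEvent:
    """Return a copy of the event with the subscriber ID replaced by a keyed hash."""
    digest = hmac.new(secret_key, event.subscriber_id.encode("utf-8"), hashlib.sha256)
    return replace(event, subscriber_id=digest.hexdigest())


# Made-up example record and key, for illustration only.
raw = CdrEvent("subscriber-0001", "cell-A12", datetime(2024, 1, 1, 8, 30), "voice")
safe = pseudonymise(raw, secret_key=b"example-key-held-by-the-operator")
```

Because the same key always maps a given identifier to the same pseudonym, a device can still be followed over time for mobility analysis without exposing the raw identifier. This also means that the key itself must be protected and managed carefully.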
It is critically important to understand what CDRs do not contain. They never include the content of calls, messages, or internet activity. Analysts cannot see what was said, written, or accessed online. This distinction is central to both ethical communication and legal compliance, and it should be clearly articulated to stakeholders and the public. For more on how to communicate about MPD initiatives see Chapter 7.
Signalling Data
Signalling data (sometimes called network signalling, probe data, or cellular signalling traffic) is the stream of technical “keep the network working” events generated by continuous communication between a handset and the mobile network. These events are generated not only when a person makes a call, sends an SMS or starts a data session, but also by mobility and location-management mechanisms. For example, the handovers that occur as a device moves between cells generate signalling events, as does the “attach or detach” activity triggered when a phone powers on or off. In practice, the records look similar to CDR data, but the dataset is much larger than billing-oriented datasets because it includes operational signalling generated by the network itself and by idle devices, not just data produced when subscribers actively use their phones. This is both a benefit (more data is available) and a practical challenge: signalling datasets can be enormous, making data storage, access governance, and processing materially harder than for CDR pipelines.
Mobile Phone GPS-derived Data
GPS-derived MPD refers to location information captured directly by the global positioning system (GPS) sensors embedded in smartphones and other mobile devices, typically via apps that have permission to record and share location (Barreras and Watts 2024). Unlike the network-generated datasets described above, GPS data are collected from a device’s onboard navigation chipset and can provide latitude/longitude coordinates with high geographical precision (often within a few metres) and high temporal frequency (e.g., seconds to minutes). They can be collected continuously from devices anywhere on the globe, thus offering global data coverage. However, the geographical precision and temporal frequency of the data may vary depending on how the app is configured, how users engage with the app, how user consent is managed, and the type of technology built into the device collecting the data.
Overall, GPS-derived smartphone datasets are a powerful complement to network-generated data in mobility analytics. They trade population coverage for precision, detail and geographical coverage in the recorded spatial trajectories. This geographical coverage makes it easier to capture cross-national movements. Using CDR data to capture such moves is more challenging because people may switch SIM cards, devices and/or operators, preventing the recording of an uninterrupted sequence of a device’s locations. Thus, the choice between GPS data and CDR data, or their integration, depends on the data and analytical requirements of the task at hand, such as whether fine-grained individual movement paths or broad population flow patterns are more central to the application.
1.1.3 The spatial and temporal characteristics of CDRs vs Signalling and GPS data
The spatio-temporal characteristics of CDR Data
The spatial resolution of CDR data is determined by the mobile network infrastructure rather than the device itself. Location is inferred from the cell site providing the service. This acts as a proxy for the user’s position, capturing the interaction between the mobile network infrastructure and geographic position of devices. In dense urban environments, cell towers may cover relatively small areas, resulting in finer spatial granularity. In rural or remote areas, a single tower may cover a larger area, leading to coarser location estimates. Planners must account for this variability when assessing whether CDRs are suitable for a particular analytical purpose and application (Blondel et al. 2015; Ricciato et al. 2020).
The frequency of events may affect the temporal resolution of CDR data. Data are generated based on the occurrence of events reflecting user behaviour, rather than being a continuous data stream per se (other forms of higher resolution data from MNOs include what may be called signalling or ping data). In CDRs, a user’s location can only be estimated when the user actively uses the network. As a result, the temporal density of CDRs can vary widely across individuals and contexts, influenced by factors such as phone ownership, usage patterns, socioeconomic status and network pricing. This intermittency introduces analytical challenges that must be addressed through appropriate statistical methods, data integration and careful interpretation (Blondel et al. 2015; Wesolowski et al. 2013; Ricciato et al. 2020).
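The intermittency described above can be quantified before any analysis begins. The sketch below, which uses made-up events for two hypothetical subscribers, counts the events observed per subscriber and finds the longest interval during which each subscriber is unobserved. The input format (pairs of pseudonymised identifier and timestamp) is an assumption for illustration.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Hypothetical (pseudonymised_id, timestamp) pairs for two subscribers on one day.
events = [
    ("a", datetime(2024, 1, 1, 8, 0)),
    ("a", datetime(2024, 1, 1, 8, 45)),
    ("a", datetime(2024, 1, 1, 13, 0)),
    ("a", datetime(2024, 1, 1, 19, 30)),
    ("b", datetime(2024, 1, 1, 9, 0)),
    ("b", datetime(2024, 1, 1, 21, 0)),
]


def temporal_density(events):
    """Return {subscriber: (event_count, longest_gap)} from (id, timestamp) pairs."""
    by_subscriber = defaultdict(list)
    for subscriber, ts in events:
        by_subscriber[subscriber].append(ts)

    summary = {}
    for subscriber, stamps in by_subscriber.items():
        stamps.sort()
        # Time elapsed between each pair of consecutive events.
        gaps = [later - earlier for earlier, later in zip(stamps, stamps[1:])]
        summary[subscriber] = (len(stamps), max(gaps, default=timedelta(0)))
    return summary


density = temporal_density(events)
```

In this example, subscriber "b" is observed only twice, twelve hours apart, so nothing can be said about their location in between. Summaries of this kind can help planners judge whether the data are dense enough for the intended indicator.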
Comparison of CDRs to signalling and GPS-derived data
Relative to CDRs, the key differences in signalling data are sampling frequency and who gets observed. CDRs are typically generated when a billable telecommunications event occurs (call/SMS/data session), so people who use their phones infrequently can appear “missing” for long stretches of time. Signalling data can produce observations every few seconds or minutes while the phone is switched on, including for passive/idle devices. This often yields much higher temporal granularity and better continuity for mobility mapping, especially for inferring trips, dwell times and flows for low-usage subscribers.
GPS data, by contrast, tend to have much higher spatial precision and geographical coverage (as opposed to population coverage) than network-generated data. That is because the location information originates from satellite signals interpreted by the phone’s operating system and applications. However, GPS datasets can be skewed by aspects such as user opt-in, app usage patterns and battery-saving behaviours, since data collection typically requires explicit permission and can be turned off or throttled by users or the OS. Additionally, although GPS datasets may provide greater geographical coverage, offering an opportunity to capture cross-national movements, they generally provide more limited population coverage than CDR data. A single MNO may cover over half of the mobile phone user population, whereas GPS data from a single application, such as Airbnb, may represent a much smaller share of mobile phone users, in this case mainly those seeking accommodation.
Additionally, GPS data may be susceptible to more sources of bias than CDR and signalling data. GPS data are generated primarily from smartphones, which reduces the potential sample size and restricts data collection to a selective, likely wealthier segment of the population. This may be particularly true in low-income economies. By contrast, CDR and signalling data cover both smartphones and feature phones, often achieving far higher population coverage. In many low- and middle-income contexts, and depending on the use case, the broader coverage of CDRs can outweigh the lower precision available through such datasets, particularly for national-level analysis and policy monitoring. GPS data may also contain duplicate information generated from the same device but captured through two or more applications. Such duplication may create additional representativeness biases and further distortions in the data.
In summary: compared to GPS app data, CDRs and signalling data usually have lower spatial precision (often cell/sector-level rather than metre-level) and geographical coverage, but they can have broader population coverage because they are collected network-side rather than through opt-in apps. In contrast, GPS data tend to be higher resolution and more geographically precise, potentially at the expense of smaller population coverage and a larger number of sources of bias.
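As an illustration of the duplication issue, the sketch below removes GPS records that appear to describe the same device at the same place and time but were captured through different applications. It assumes, hypothetically, that a common device identifier is available across apps, and it treats records as duplicates when they fall in the same minute and the same rounded coordinate cell. Both the record layout and the thresholds are assumptions, and simple binning of this kind can miss duplicates that straddle a bin boundary.

```python
from datetime import datetime

# Hypothetical records: (device_id, app, timestamp, latitude, longitude)
records = [
    ("d1", "app_x", datetime(2024, 1, 1, 8, 0, 5), -1.28640, 36.81720),
    ("d1", "app_y", datetime(2024, 1, 1, 8, 0, 20), -1.28642, 36.81719),
    ("d1", "app_x", datetime(2024, 1, 1, 9, 0, 0), -1.29000, 36.82000),
    ("d2", "app_x", datetime(2024, 1, 1, 8, 0, 10), -1.28640, 36.81720),
]


def deduplicate(records, decimals=3):
    """Keep the earliest record per device, minute and rounded coordinate cell."""
    seen = set()
    unique = []
    for device, app, ts, lat, lon in sorted(records, key=lambda r: r[2]):
        key = (
            device,
            ts.replace(second=0, microsecond=0),  # truncate to the minute
            round(lat, decimals),
            round(lon, decimals),
        )
        if key not in seen:
            seen.add(key)
            unique.append((device, app, ts, lat, lon))
    return unique


unique = deduplicate(records)
```

Here the second record for device "d1" is dropped because it repeats the first one via another app, while the record for "d2" is kept because it belongs to a different device.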
1.2 Clarifying the Value and Purpose of Using MPD
1.2.1 Strategic Value and Use Cases
The growing interest in MPD stems from its ability to complement traditional data sources such as censuses, household surveys, and administrative records. These traditional sources are indispensable but often expensive, infrequent, and slow to update. MPD, by contrast, is generated continuously and can provide near real-time insights (Blondel et al. 2015). Typical motivations for launching an MPD initiative include:
- Addressing temporal or spatial gaps in existing statistics
- Enhancing the timeliness of policy-relevant indicators
- Supporting rapid decision-making during crises or shocks (Rowe et al. 2023)
- Improving understanding of population mobility and service access
Each of these motivations carries different technical and governance implications. For example, crisis response applications may prioritise speed and automation, while official statistics may emphasise methodological rigour and reproducibility. Different policy and statistical applications are presented in more detail in Chapter 2.
1.2.2 The Importance of a Clear Purpose
A recurring lesson from past initiatives is the central importance of clearly defining the purpose of data access from the outset. MPD is powerful, but it is not universally appropriate. Planners should explicitly ask whether the policy or analytical question at hand genuinely requires MPD, or whether it could be answered using simpler, less sensitive, or less costly data sources.
A clearly articulated purpose guides decisions about which variables are needed, how frequently data should be updated, which stakeholders must be involved, and what level of investment is justified. Without this clarity, initiatives risk becoming technically complex without delivering commensurate public value.
1.2.3 Developing a Theory of Change
Developing a Theory of Change can be helpful when planning MPD initiatives. A Theory of Change provides a practical planning framework for articulating how the activities being undertaken are expected to lead to meaningful policy and development impacts. Rather than relying on implicit assumptions or linear thinking, it encourages those planning an MPD project or programme to map out the full causal pathway from what they do to what they want to achieve, making explicit the links between activities, outputs, outcomes, and impact.
Box 1: Developing a Theory of Change: a Tool for Planning based on Purpose
In the context of MPD initiatives, producing a Theory of Change is a particularly valuable undertaking because these initiatives operate within complex data ecosystems involving multiple stakeholders, technical systems, legal constraints, and institutional incentives. The framework can help initiative designers move beyond the assumption that simply accessing MPD will automatically improve decision-making, and instead examine what must happen at each stage for data to be used effectively and responsibly.
A typical Theory of Change begins by clarifying the intended impact, such as improved evidence-based policymaking, sustainable integration of MPD into national data systems, responsible data use, or enhanced ability to monitor and anticipate policy challenges. From there, it works backward to identify the outcomes that must be in place for that impact to occur, the outputs required to achieve those outcomes, and the specific inputs and activities needed to produce those outputs.
Crucially, Theory of Change requires explicit identification of preconditions and assumptions at each step. In MPD projects these can include aspects related to: stakeholder commitment, institutional leadership, legal and regulatory feasibility, technical capacity, data quality, staff incentives, and availability of complementary data sources. Being clear about the Theory of Change also draws attention to contextual factors such as political, institutional, or environmental considerations that may enable or hinder progress. By identifying “killer assumptions” during early stages of design and planning, initiatives can adjust plans, mitigate risks, and recognise where further analysis may be needed.
For example, a seemingly simple sequence of events, such as partners signing a data-sharing agreement document in which they agree to collaborate on receiving data and conducting analysis, will be dependent on multiple underlying conditions. These include trust between partners, adequate data governance, secure and privacy-protecting data pipelines, staff skills and availability to do analysis, and the effective engagement and alignment of all relevant stakeholders. Theory of Change provides a structured way to test whether these conditions are realistically in place.
Applying Theory of Change to MPD initiatives is recommended because it can help practitioners design more robust, context-sensitive projects. It strengthens planning by clarifying causal logic, improves the likelihood that activities will lead to desired outcomes, and supports learning by making assumptions explicit and open to review.
1.3 Core Technical and Institutional Requirements
1.3.1 Data Access Arrangements
Access to high-quality CDR data is the foundational requirement of any MPD initiative. In practice, this access is secured through formal agreements with MNOs or, in some contexts, telecom regulators. These agreements must specify the scope of data shared, the frequency of updates, permitted uses, retention periods, and security requirements.
From a technical standpoint, planners must ensure that the data provided includes not only CDR event logs but also the associated data required for interpretation, particularly cell tower location information. These datasets need to be accurate and up-to-date, otherwise the subsequent analytical results may be misleading or invalid (see also Chapter 5 for more information on data characteristics and quality).
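One simple check that follows from this requirement is to confirm that every cell identifier appearing in the CDR events can be matched to the cell tower reference table. The sketch below uses made-up identifiers and coordinates; the table layout and names are assumptions for illustration.

```python
# Hypothetical cell tower reference table: cell_id -> (latitude, longitude)
towers = {
    "cell-A12": (-1.2864, 36.8172),
    "cell-B07": (-1.3000, 36.7800),
}

# Hypothetical CDR events: (pseudonymised_id, cell_id)
events = [
    ("a", "cell-A12"),
    ("a", "cell-B07"),
    ("b", "cell-C99"),  # this cell is missing from the reference table
    ("b", "cell-A12"),
]


def locate(events, towers):
    """Attach tower coordinates to events and collect events that cannot be located."""
    located, unmatched = [], []
    for subscriber, cell_id in events:
        coords = towers.get(cell_id)
        if coords is None:
            unmatched.append((subscriber, cell_id))
        else:
            located.append((subscriber, cell_id, *coords))
    return located, unmatched


located, unmatched = locate(events, towers)
match_rate = len(located) / len(events)  # share of events that can be geolocated
```

A match rate that is low, or that falls over time, may indicate that the tower table is incomplete or out of date and should be raised with the data provider before results are interpreted.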
1.3.2 Infrastructure and Data Processing Capacity
CDR datasets are typically large, often comprising millions or billions of records per day. Processing such volumes requires adequate computational infrastructure, including secure servers, scalable storage, and efficient data processing frameworks. Decisions must be made regarding whether this infrastructure is hosted within a government environment, provided by an MNO, or operated through a trusted third party. Software choices also matter. While many analytical tasks can be performed using open-source tools, the organisations involved must ensure that staff have the skills to use them effectively and securely.
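A back-of-envelope calculation can help when sizing storage. The sketch below multiplies an assumed daily record count by an assumed average record size. Both figures are purely illustrative assumptions, not measurements from any operator, and actual volumes depend on the number of subscribers, the fields retained, and the storage format and compression used.

```python
def daily_storage_gb(records_per_day: int, bytes_per_record: int) -> float:
    """Approximate uncompressed storage per day in gigabytes (1 GB = 1e9 bytes)."""
    return records_per_day * bytes_per_record / 1e9


# Illustrative assumptions only.
records_per_day = 500_000_000   # assumed number of CDR events per day
bytes_per_record = 100          # assumed average size of one record in bytes

per_day_gb = daily_storage_gb(records_per_day, bytes_per_record)
per_year_tb = per_day_gb * 365 / 1000  # approximate terabytes per year
```

Under these assumptions the raw data would require roughly 50 GB per day, or about 18 TB per year, before accounting for backups, intermediate tables or derived indicators.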
1.3.3 Human Resources and Leadership
Technical infrastructure alone is insufficient for an MPD initiative without the involvement of skilled personnel, both technical and non-technical. MPD initiatives typically require a multidisciplinary team, including data engineers to manage ingestion and pipelines, analysts and data scientists to develop and implement the relevant indicators and models, and project managers to coordinate delivery, liaise with stakeholders, mobilise resources and monitor timelines. Furthermore, experience shows that strong, clearly designated leadership is essential. Champions who drive the initiative forward and coordinate the relevant actors play a critical role in maintaining strategic focus, resolving institutional tensions, and ensuring that technical work remains aligned with the initiative’s ultimate objectives.
Typically the functions that should be considered for an effective initiative include:
Project management: Individuals who will serve as the overall coordinator of work plans and be the key point of contact for other stakeholders will be required, as a minimum, at both the MNO and the end-user organisation (e.g. the national statistical office, or a Ministry). These roles help to ensure that the initiative is well managed, runs smoothly and meets its objectives by enabling other technical staff to focus on their components in a coordinated fashion. Effective project managers in an MPD context must be comfortable operating across institutional boundaries, translating technical constraints into operational implications, managing risks related to timelines and data access, and ensuring that governance and approval processes are respected. They are also responsible for documenting decisions, managing scope, and establishing clear reporting lines.
Data engineering: Any MPD initiative will require the expertise of IT, infrastructure or data engineers who will handle obtaining access to the data (which can include, for instance, setting up secure VPNs) and getting the data into the structures and formats needed for analysis. MNO staff working with billing records and network data will need to be involved. To engage effectively with the technical staff at the MNO, the receiving organisation will also need staff with comparable knowledge and skills. Data engineers are responsible for designing and maintaining secure data transfer mechanisms, implementing anonymisation or pseudonymisation processes where required, and constructing and maintaining the data pipelines. They usually also work on ensuring that data environments meet agreed security standards, including access controls, logging, and auditability. Given the volume and velocity of MPD, experience with large-scale distributed systems and database optimisation is often required.
Data science and data analysis: Data scientists and data analysts are responsible for transforming processed MPD into meaningful statistical outputs and decision-support tools. This includes the construction of indicators, calibration of models, validation against ground truth data, and development, deployment and documentation of the methodologies used. The statistical rigour of their work will be critical to the final results, so they need to be familiar with inferential statistics, machine learning techniques (where appropriate) and reproducible research practices. They will also need to work closely with subject-matter experts to ensure that outputs are policy-relevant and methodologically defensible. They are also responsible for implementing clear version control, transparent codebases, and replicable workflows.
Survey expertise: Where MPD is used to complement, calibrate or partially substitute traditional data sources, survey statisticians and sampling experts play a vital role. These professionals provide guidance on benchmarking MPD-derived indicators against other data such as field or phone surveys, assessing coverage bias, and designing hybrid methodologies that integrate conventional and non-traditional data sources. In many cases, they will work with analysts to develop weighting schemes, address biases inherent in mobile phone usage patterns, and assess representativeness relative to the target population. They can also support the development of validation frameworks, advise on weighting and adjustment procedures, and help interpret discrepancies between sources (Cabrera and Rowe 2025). Their involvement is particularly important in official statistics contexts, where methodological transparency and compliance with statistical quality standards are required. Survey experts also help ensure that MPD outputs are aligned with established indicator definitions and international reporting frameworks where applicable.
Legal, data protection and compliance: Given the sensitive nature of telecommunications data, legal and compliance expertise must be embedded in the initiative from its inception. Legal advisors are responsible for reviewing and drafting data sharing agreements, memoranda of understanding, and contractual provisions that define permissible use, retention periods, liability, and intellectual property arrangements. Data protection officers or privacy specialists should assess compliance with applicable data protection legislation and regulatory frameworks, including requirements related to anonymisation, purpose limitation, data minimisation, and user rights. They may also oversee data protection impact assessments and ensure that governance structures are formally documented. Close collaboration between legal, technical, and analytical teams is essential to ensure that privacy-preserving measures are technically feasible and legally robust.
Other roles: In addition to the roles specified above, it can be useful, depending on the nature of the initiative, to include in the project team staff who are able to deliver communications activities, media liaison work, and monitoring and evaluation to gather information on how well the initiative is meeting its purpose. Communications specialists can help articulate the objectives, safeguards and benefits of the initiative to both internal and external audiences, which can be particularly important where public trust considerations arise. Monitoring and evaluation professionals can design frameworks to assess effectiveness, uptake, and impact, including the identification of performance indicators and feedback mechanisms. Finally, as with any project, administrative functions such as financial expertise will also be necessary and need to be identified and assigned to support the initiative. Budget oversight, procurement management, and resource tracking are essential to maintaining operational continuity and accountability.
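To illustrate the kind of weighting work described under survey expertise above, the sketch below computes simple post-stratification weights by region. This is one common adjustment approach, shown here as an assumption-laden example rather than as the method recommended by this manual. Each region's weight is its share of the reference population divided by its share of observed subscribers. The region names and counts are made up for illustration.

```python
# Hypothetical counts per region.
population = {"north": 600_000, "south": 400_000}   # e.g. from a census
subscribers = {"north": 150_000, "south": 250_000}  # observed in the CDR data


def poststratification_weights(population, subscribers):
    """Weight per region = population share / subscriber share."""
    total_pop = sum(population.values())
    total_sub = sum(subscribers.values())
    return {
        region: (population[region] / total_pop) / (subscribers[region] / total_sub)
        for region in population
    }


weights = poststratification_weights(population, subscribers)
```

In this example, subscribers in the under-represented "north" region receive a weight above 1 and those in the over-represented "south" a weight below 1. In practice, survey statisticians would choose the strata and reference data and would assess whether such an adjustment is adequate for the indicator in question.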
1.4 Legal, Regulatory, and Ethical Context
MPD initiatives operate within a complex legal landscape that often spans multiple domains, including data protection, telecommunications regulation, cybersecurity, and national security law. Compliance with these frameworks is not optional and must be addressed as an integral part of planning, not as an afterthought.
Legal analysis should clarify the lawful basis for data processing, the roles and responsibilities of data controllers and processors, and the rights of data subjects. Because legal frameworks vary significantly across jurisdictions, including regional frameworks such as the African Union Malabo Convention, expert legal advice is essential (African Union 2018). Training materials and general guidance cannot substitute for context-specific legal interpretation.
Ethical considerations extend beyond legal compliance. Even where data use is lawful, it may raise concerns related to fairness, proportionality, discrimination, or surveillance. Ethical reflection should therefore accompany legal analysis, particularly for use cases involving vulnerable populations (GSMA 2016).
1.5 Stakeholder Ecosystem and Governance
MPD initiatives are inherently collaborative. There are a significant number of stakeholders who need to be involved, consulted and/or considered when planning and implementing an MPD initiative.
These include, for example:
- National statistical offices (NSOs): NSOs normally have a legal mandate and authority to produce a country’s official statistics (in some contexts, additional agencies may also have such mandates). NSOs are also usually enabled through a legal basis to collect personal data (statistical authority) and are expected to abide by certain legal guarantees and obligations to protect it (statistical confidentiality). They will have teams with methodological expertise for producing statistical outputs and will also often have access to complementary data sources for validation and bias adjustment of CDR data. The NSO is often an end user of an MPD initiative’s outputs, and may also be a conduit for distribution of such data to other parts of government.
- MNOs: MNOs are the data owners of CDR data. They control the underlying data and can provide access to it if they decide to do so, on the basis of certain considerations. MNOs tend to have good internal knowledge on the data source, information technology technical capacity and knowledge about the systems that produce and store the data. For ongoing, sustainable data pipelines, MNOs need to agree to provide regular access to new CDR data, including all associated data fields as well as regular updates to cell tower location data. When data gaps arise due to disruption to a data pipeline (e.g. power outages, infrastructure damage etc) they need to be able to help fill those data gaps. MNOs will seek to both protect subscriber privacy and their own commercial interests.
- Telecommunications regulator: The regulatory body responsible for overseeing MNOs and issuing their licences already has regular contact with MNOs and should be consulted early in the process of establishing an MPD initiative to ensure that it has no objections to data processing. In some cases, the regulatory authority may itself wish to obtain insights from CDR analysis, as it may also have a development agenda. The regulator may also have a mandate to collect data such as CDRs. It can therefore play a number of different roles, including: (a) facilitating a partnership in the initial phases; (b) processing CDRs to produce Information Society statistics; (c) stewarding data for other applications.
- Data protection authorities: Most countries now have a data protection commission, agency, bureau or similar body responsible for overseeing and regulating data protection. Engaging with this regulatory body early in the design of an MPD initiative is sometimes required by law but in any case is advisable, as it can help to ensure that public interest objectives are balanced with privacy and rights protections, thereby strengthening public trust. These authorities can also give advice and guidance on data protection measures and whether they are proportional to the risks.
In addition to the above, whose involvement is normally essential for a sustainable initiative, the following stakeholders should also be considered:
- Users: Any end-user, such as a Ministry, Department, Agency or other actor within government or beyond, should be involved at an early stage in order to understand their needs and specify their requirements. Their involvement may also support the planned activities, for instance by mobilising political support for the initiative or helping to raise funds to proceed;
- Research community: The academic community can provide useful inputs to MPD initiatives by supporting development of methods, providing academic rigour and mobilising resources for research;
- Technical service providers: It can be beneficial to consult and/or engage the services of experts in the field who have international exposure, can share experiences and best practices, and provide technical assistance and solutions or products;
Each of the above stakeholders has its own mandate, incentive structures, and concerns in relation to using MPD. Addressing these will be essential to ensure ongoing cooperation within the data ecosystem, as will creating the framework within which the stakeholders work together. This requires effective governance, with clearly defined roles, decision-making processes, and accountability mechanisms that reflect the diversity of stakeholder interests. For more on good data governance practices, see Chapter 6.
As part of the MPD for Policy programme of the World Bank’s Global Data Facility, a Maturity Assessment Framework was developed which provides a structured yet flexible way to assess the readiness, progress, impact, and sustainability of MPD initiatives for official statistics. Built around four maturity stages and three assessment areas (feasibility, impactfulness, and sustainability), the framework helps to identify strengths, gaps, and support needs across the legal, technical, organisational, ethical, and financial dimensions briefly described in this chapter. The framework is accompanied by digital and printable self-assessment tools, and is intended to support individual reflection, stakeholder dialogue, and informed planning, while allowing adaptation to local contexts. Readers are encouraged to download a copy of the tool and use it to assess the current state of play in their context and to envision the desired future state (linking this to a Theory of Change exercise as described in Box 1).
1.6 Risk management and mitigation measures
Planners of MPD initiatives need to consider and explicitly address a range of risks, which are often interconnected and can quickly undermine an initiative if not proactively managed. We focus here on the data-related risks associated with processing CDRs, but planners should also plan to mitigate other risks: ethical risks (including bias, exclusion, or misuse); reputational risks, such as those arising from security breaches or from public misunderstanding or mistrust of the use of MPD by non-MNO bodies; and legal and regulatory risks, such as those associated with non-compliance with licensing conditions or local laws.
1.6.2 Risk mitigation measures
Risks can be mitigated through a combination of technical, organisational, and governance measures. These include clear staff responsibilities, regular training in data protection and cybersecurity, and application of privacy-enhancing techniques. Engaging oversight bodies, ethics committees, and civil society can further strengthen legitimacy and resilience.
It is good practice for MPD initiatives to develop and implement a robust data governance framework which incorporates strong measures to assess privacy, security and ethical risks and identify appropriate mitigations. The process of assessing and mitigating risks should be transparent and ideally also inclusive of the different stakeholders in an MPD initiative.
Some of the other mitigation strategies that can be adopted by an MPD initiative include:
- Effective use of robust data encryption tools
- Strict, granular access controls, monitoring and auditing
- Regular security updates and security audits (this is important, because risks evolve over time with changes in technology, personnel, and legal frameworks)
- Full compliance with all relevant data protection regulations.
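To make the idea of privacy-enhancing techniques more concrete, the following is a minimal Python sketch of two commonly used measures: replacing subscriber identifiers with a keyed hash (pseudonymisation) and suppressing aggregate counts that fall below a minimum threshold. It is an illustration under stated assumptions, not a prescribed method: the function names, the record layout (identifier, region), the phone numbers, and the threshold value are all hypothetical, and an appropriate threshold depends on local regulation and the agreed risk assessment. Note also that keyed hashing pseudonymises rather than anonymises data, so the secret key must itself be protected.

```python
import hashlib
import hmac
from collections import defaultdict


def pseudonymise(msisdn: str, secret_key: bytes) -> str:
    """Return a keyed SHA-256 hash (HMAC) of a subscriber identifier.

    The same identifier and key always give the same pseudonym, so records
    can still be linked, but the original number is not stored downstream.
    """
    return hmac.new(secret_key, msisdn.encode("utf-8"), hashlib.sha256).hexdigest()


def subscribers_per_region(records, secret_key: bytes, min_count: int):
    """Count distinct pseudonymised subscribers per region.

    `records` is an iterable of (msisdn, region) pairs (hypothetical layout).
    Regions with fewer than `min_count` distinct subscribers are suppressed,
    i.e. left out of the result entirely.
    """
    seen = defaultdict(set)
    for msisdn, region in records:
        seen[region].add(pseudonymise(msisdn, secret_key))
    return {
        region: len(ids)
        for region, ids in seen.items()
        if len(ids) >= min_count
    }


if __name__ == "__main__":
    # Made-up identifiers and key, for illustration only.
    key = b"replace-with-a-securely-stored-secret"
    sample = [
        ("255700000001", "Region A"),
        ("255700000002", "Region A"),
        ("255700000003", "Region A"),
        ("255700000001", "Region A"),  # repeat event, same subscriber
        ("255700000004", "Region B"),
    ]
    # The threshold of 3 is an assumption chosen to suit this tiny sample.
    print(subscribers_per_region(sample, key, min_count=3))
```

In this sketch, Region B is dropped from the output because only one distinct subscriber was observed there. In practice, such steps would typically sit alongside the encryption, access control, and auditing measures listed above rather than replace them.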
1.7 Designing for Long-Term Sustainability
Sustainability should be considered from the earliest planning stages. Short-term pilot projects can generate valuable learning, but lasting impact requires long-term data access arrangements, scalable infrastructure, and institutionalised processes.
Key elements of sustainable design include:
- Clearly defined and enduring objectives
- Robust data governance frameworks (see also Chapter 6)
- Automated and resilient data pipelines (see also Chapter 4)
- Continuous investment in human capacity
- Ongoing stakeholder engagement and communication (see also Chapter 7)
- Integrated monitoring, evaluation, and learning systems
Financial sustainability also matters. While initial setup costs may be high, planners must account for ongoing operational expenses, including infrastructure maintenance, staff retention, training, and stakeholder engagement. Funding strategies may need to combine internal budgets, in-kind contributions, donor support, and the use of open-source tools.
1.8 Conclusion
Planning an MPD initiative is a complex, multidisciplinary undertaking. Technical feasibility, legal compliance, ethical responsibility, institutional coordination, and long-term sustainability are all equally important. By approaching planning as a structured, purpose-driven process and by investing in both people and systems, organisations can responsibly harness MPD to generate meaningful public value.