Synthetic Data Generation on The Hype Train

By Stefan Keller, Reik Leiterer, Nicolas Lenz

When it comes to predictions, we should always be careful. But Synthetic Data Generation is certainly one of the trend topics. Gartner is not the only one to say that this technology will become established in the next few years.

On April 13, 2022, the Expert Group “Spatial Data Analytics” met in Zurich, in person and virtually, on the topic of “Geospatial Synthetic Data”. In the modern premises of the Gleisarena, provided by the FFHS, the host, Aldo Lamberti of Syntheticus, presented three top-class talks to the 20 participants. The following is a brief summary.

The Spatial Data Analytics Expert Group is a nice place to share ideas. It’s part of the data innovation alliance which is instrumental in making Switzerland a recognized hub for data-driven value creation.

Left to right: Jakob Dambon, Aldo Lamberti, Josef Boesze

Aldo Lamberti began with a presentation on “How to securely collaborate and compute on synthetic geo data”. Syntheticus envisions a world in which the full potential of data is realized while fundamental privacy rights are preserved. In this view, synthetic data is the solution: it mimics real data while preserving its utility and protecting privacy. According to the company, public and private entities around the world rely on it to unlock and monetize untapped data without violating compliance, securely collaborating and processing synthetic data across the entire data value chain. Syntheticus provides a SaaS platform for enterprises to generate synthetic data at scale while maintaining privacy.

Jakob Dambon of SwissRe spoke on “Spatial and Spatio-Temporal Statistics, Using Both Frequency and Bayesian Approaches.” He explained that one of the best-known regression methods is Ordinary Least Squares (OLS), which is easy to model and interpret. However, when dealing with spatial data, the model assumptions are usually no longer valid: observations that are close together are more dependent than observations that are far apart. This is where geostatistical methods come into play. These methods attempt to explain the remaining dependencies using, for example, a Gaussian process, whose covariance function captures how the dependence between observations changes with distance. Finally, using geostatistical methods, he modeled the covariates as a fixed external trend while allowing the intercept to vary over space.
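
To make the idea of a covariance function concrete, here is a minimal sketch in Python. It assumes an exponential covariance, which is one common choice and not necessarily the one used in the talk; the site coordinates, the variance and the range parameter are hypothetical values picked for illustration.

```python
import math

def exponential_covariance(p, q, variance=1.0, range_km=10.0):
    """Covariance between two locations that decays with their distance.

    Assumption: exponential model, variance * exp(-distance / range).
    """
    distance = math.dist(p, q)  # Euclidean distance between the two sites
    return variance * math.exp(-distance / range_km)

# Hypothetical observation sites, coordinates in kilometres.
site_a = (0.0, 0.0)
site_b = (1.0, 0.0)    # 1 km from site_a
site_c = (30.0, 40.0)  # 50 km from site_a

cov_near = exponential_covariance(site_a, site_b)
cov_far = exponential_covariance(site_a, site_c)

print(f"covariance at  1 km: {cov_near:.4f}")
print(f"covariance at 50 km: {cov_far:.4f}")
```

Nearby sites get a covariance close to the full variance, while distant sites are almost uncorrelated. OLS ignores this structure because it treats all observations as independent.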

Josef Boesze of itopia ag spoke about “Developing and Testing without any Risks or Side Effects using iSynth”. itopia – as a boutique IT consulting firm for the financial world – has long suspected that testing based on production data – even when anonymized – leads to risks and undesirable side effects. Moreover, machine learning and Big Data analytics have become the natural enemies of solutions based on anonymized data. In his opinion, it’s time for a change. The alternative is synthetic data. However, until now, generating synthetic data was too costly, the results were not satisfactory, or it was simply not practical. Efficient and risk-free development, testing and training are now possible thanks to consistent synthetic test data. itopia offers an agile and object-oriented approach as well as suitable test data factory tools for projects and DevOps.

After a lively discussion, the participants present went out for pizza together at a nearby casual industrial-style venue. The food and drinks were kindly sponsored by ExoLabs. While networking, the next host was also already determined. This means that we can look forward to more interesting meetings!

Service Lunch Smart Services: Transformation of the service business of Swiss industrial companies

With Boris Ricken, AWK Group

COVID-19 has posed enormous challenges to Swiss industrial companies over the past two years. The service sector has been particularly hard hit, as it relies on personal interactions with customers. At the same time, digital technologies have changed the service business.

In his presentation, Boris Ricken shed light on the implications of these developments for Swiss industrial companies. He showed how long-term trends in service provision have been reinforced, e.g., by local service provision in combination with central products and services. Given this, different fields of action were elaborated, among others for new digital services and business models.

The presentation was accompanied by lively discussion and input from the participants. The expert group Smart Services is a very active platform for sharing and growing knowledge and expertise in this field.

Contact person: Jürg Meierhofer

Databooster – To support SMEs

HEPIA, HES-SO, OPI, and the NTN Innovation Booster Databooster have joined forces to support SMEs on their way from a rough idea to a funded research project. On 1st March 2022, a joint event was organized at HEPIA in which 30 interested persons from industry took part.

After a welcome on behalf of OPI (Office de Promotion des Industries et des Technologies) by Hélène Gache (Directrice at OPI) and on behalf of HEPIA (Haute école du paysage, d’ingénierie et d’architecture de Genève) by Claire Baribaud (Directrice at HEPIA), Nabil Abdennadher (Professor of Computer Science at HEPIA) presented the Databooster objectives and innovation process to the audience. He pointed out that the NTN Innovation Booster supports the preliminary phase of open innovation before it comes to an innovation project.

Two success stories of the last year were presented by SMEs.

First, Andreas Seonbuchner (CEO and partner of CitizenTalk) showcased his journey within the Databooster, from first idea discussions with a research group to securing appropriate team partners through a call for participation to the community. An interdisciplinary team with potential customers proceeded to shape his idea further. The vital clarity on implementation options was gained through an Innocheck for a feasibility study (together with a university of applied sciences). Finally, a consortium was founded for an already accepted Innosuisse project.

Thereafter, Sami Jaballah, Co-Founder and CEO of DNEXT Intelligence SA, described his success with the Databooster: with two matched partners and a solid framework that shaped his idea, he is now preparing the Innosuisse project submission. At one stage, Sami admitted, he was unsure about his idea. However, thanks to the competence and expertise the Databooster provided, he was able to turn his relatively vague idea into a structured, mature concept.

After these two presentations, an open discussion on various topics followed, such as the difference between Innovation Booster and Innosuisse Innocheck, confidentiality and IPR, funding model and budget allocation, the definition of innovation, etc. A delicious aperitif concluded the event.
Many thanks to all persons involved in organizing the event. We are looking forward to many Calls for Participation from the attendees.

Challenges in Applied Computer Vision

By Philipp Schmid, CSEM, Andrea Dunbar, CSEM and Jakob Olbrich, PwC

Meeting of Expert Group Machine Learning Clinic, February 11 2022

What do expensive mechanical watches, sand, e-waste and cockpits have in common? All of these areas pose tough challenges in computer vision. Human eyes are very hard to outperform with cameras and image processing. What people perform with their visual sense every day is just amazing, and creating these capabilities remains a complex challenge for computer vision.

At this first in-person meeting of the year, the expert group focused on various real-world vision problems.
The event was hosted by PwC in their inspiring location in Oerlikon. Four speakers set the stage for great discussions followed by a lively sitting Apéro.

Lukas Schaupp, PwC «Detecting e-Waste»
The number of electronic devices people dispose of is growing exponentially: not just smartphones, laptops and earphones, but also larger household items like dishwashers, toasters and vacuum cleaners. As prices for raw materials are skyrocketing, automated recycling of e-waste is becoming attractive. Lukas demonstrated strategies to localize and classify different electronic devices in bulk on a conveyor belt.

Andrea Dunbar, CSEM «AI at the Edge – Safety in the next generation Cockpits»
There are multiple reasons for, and advantages to, processing at the edge. Andrea demonstrated this impressively with the use case of next-generation cockpits. Pilot drowsiness detection and, more importantly, high-accuracy eye gaze detection (±1°) at rates of up to 60 frames per second are only possible at the edge. What is already reality in the flight simulator today will soon be introduced in every car for the safety of our roads.

Francesco Cicala, PwC «Automatic image thresholding for semantic segmentation»
The quality of concrete depends heavily on the right mixture of sand and pebbles. In the future a smartphone app should be able to classify the correct mix by assessing the size of the sand and pebbles. Francesco introduced a powerful method to extend Otsu’s thresholding technique into a locally adaptive threshold map for the whole image. This method is robust, fully explainable and there are no labels needed. In a next phase it will be extended with a U-Net algorithm to improve accuracy.
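
For readers unfamiliar with the starting point, the sketch below shows plain global Otsu thresholding in Python. It is only the textbook baseline: the locally adaptive threshold map presented in the talk is not reproduced here, and the pixel values are made up for illustration.

```python
def otsu_threshold(pixels, levels=256):
    """Return the threshold t that maximises the between-class variance.

    Pixels with value <= t form class 0, all others form class 1.
    """
    hist = [0] * levels
    for value in pixels:
        hist[value] += 1

    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_between = 0, -1.0
    weight0 = 0  # number of pixels in class 0
    sum0 = 0     # sum of intensities in class 0
    for t in range(levels):
        weight0 += hist[t]
        sum0 += t * hist[t]
        if weight0 == 0:
            continue  # class 0 still empty
        weight1 = total - weight0
        if weight1 == 0:
            break     # class 1 empty from here on
        mean0 = sum0 / weight0
        mean1 = (sum_all - sum0) / weight1
        between = weight0 * weight1 * (mean0 - mean1) ** 2
        if between > best_between:
            best_between, best_t = between, t
    return best_t

# Hypothetical 8-bit image, flattened: dark background and bright grains.
pixels = [10] * 50 + [200] * 50
threshold = otsu_threshold(pixels)
mask = [value > threshold for value in pixels]
print(threshold, sum(mask))
```

No labels are involved: the threshold follows from the histogram alone, which is what makes the approach explainable. One simple way to make it local, again only as an assumption, would be to apply the same function per image tile.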

David Honzatko, CSEM «Photometric stereo in defect detection»
Swiss Made symbolizes perfect quality. Especially in the watch industry, requirements are demanding. The small parts are highly reflective and have complex shapes, and defects can appear randomly at any position. The key to an automated defect detection solution is photometric stereo. David presented a dome setup which can project up to 108 illumination directions. To reduce the hardware requirements whilst keeping the performance, he presented a new data augmentation technique, which boosts the training of any deep learning architecture processing the images.
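
As background, the following Python sketch shows the classical textbook form of photometric stereo, not the deep-learning pipeline from the talk. It assumes a perfectly matte (Lambertian) surface, which reflective watch parts are not, and exactly three known light directions; all numbers are hypothetical.

```python
import math

def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def solve3(a, b):
    """Solve the 3x3 linear system a @ x = b with Cramer's rule."""
    d = det3(a)
    if abs(d) < 1e-12:
        raise ValueError("light directions must not be coplanar")
    solution = []
    for col in range(3):
        replaced = [row[:] for row in a]
        for r in range(3):
            replaced[r][col] = b[r]
        solution.append(det3(replaced) / d)
    return solution

def normal_and_albedo(light_dirs, intensities):
    """Recover surface normal and albedo for one pixel.

    Lambertian model (assumption): intensity = albedo * dot(normal, light).
    """
    g = solve3(light_dirs, intensities)  # g = albedo * normal
    albedo = math.sqrt(sum(x * x for x in g))
    if albedo == 0.0:
        raise ValueError("pixel is black under all lights")
    return [x / albedo for x in g], albedo

# Three hypothetical unit light directions and the brightness of one pixel
# whose true normal is (0, 0, 1) and whose albedo is 0.5.
lights = [[0.0, 0.0, 1.0], [0.6, 0.0, 0.8], [0.0, 0.6, 0.8]]
observed = [0.5, 0.4, 0.4]

normal, albedo = normal_and_albedo(lights, observed)
print(normal, albedo)
```

With more than three illumination directions, as in a dome setup, the same system is typically solved in the least-squares sense, which makes the estimate more robust.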

A full evening of new insights and tough challenges in the field of computer vision. Thanks to everyone for the great participation and especially to the host for the amazing location and the local Apéro.

Operational ML for Service Engineers: Successes and Pitfalls

By Lilach Goren Huber, Thomas Palmé, Manuel Arias Chao (all ZHAW), Maik Hadorn, Roche

Smart Maintenance Expert Group Meeting 20.01.2022

Once more, we met online for an interesting presentation followed by vivid discussions and networking. Yes, online networking!

We started by proudly introducing our new industrial Lead: Dr. Maik Hadorn, International Product Manager, from Roche Diagnostics. Welcome Maik, we are honored to profit from your expertise!

Next, Niels Uitterdijk, the CTO and founder of Amplo, presented not only success stories but also challenges and pitfalls on the way to successful machine-learning-based predictive maintenance. As usual in our EG, this included concrete use case examples, this time from several different application fields.

After an intense Q&A session (we were 29 attendees!) we switched from Zoom to Wonder, where we had the chance to meet and network with group members. Similarly to previous meetings of our EG, this worked out really well!

We look forward to the next meeting – this time, finally, face to face.

No Time To Die

The organization of the 11th meeting of the Spatial Data Analytics Expert Group included some unexpected twists. After postponing the original meeting in September, we also had to switch to an online format at short notice on the new date. Although the excitement couldn’t quite compete with a real agent movie, we were at least pleased that we could finally welcome a large number of participants.

The real excitement came from the announced contents. Dr. Joachim Steinwendner from FFHS had offered to host the meeting and had prepared a program with the topic GIS and Health. The two announced talks were titled after Bond movies. They addressed the interface between GIS and Health, once from the pharmacological point of view and once from the perspective of geoinformatics.

PD Dr. Stefan Weiler focused on the first view. In his talk “On Her Majesty’s Secret Service” he presented the role of geodata in medicine with numerous illustrations (e.g. the Corona dashboards). Joachim Steinwendner then changed the perspective in his talk “The World Is Not Enough”. He asked the audience to imagine a GIS in which the coordinate system did not map the world, but rather the human body.

The meeting ended in an informal exchange. Plans were made for future collaborations or at least for the next visit to the cinema.

Diversity: Fully Exploit the Potential of Data Science

Just around 25% of the participants of the 8th Swiss Conference on Data Science are female. This aligns with the report from the World Economic Forum that claims that women make up only an estimated 26% of workers in data and AI roles globally. But why? And how can we change that?

We talked to Christian Hopp from BFH, who has addressed academic careers, gender, and diversity in his research, and Teresa Kubacka, who works as a freelance data scientist at Litix and is a member of the community “Women in Machine Learning and Data Science Zurich”.

Why do you think diversity is important in Data Science?

Christian Hopp: Very generally, why would anyone think that diversity should not be important in Data Science? Aiming for example at gender balance or the equal representation of minorities in Data Science means fully exploiting the talent pool and fairly distributing opportunities.

Software, algorithm, or more broadly technology development in general is first and foremost an interactive process in which various actors are involved and in which communication and collaboration help to combine different knowledge bases. Hence, to fully exploit the potential of Data Science, it is paramount to involve individuals from all sorts of backgrounds (gender, ethnicity, age, socio-economic background, etc.). Diversity broadens the search horizon; it may help to develop insights and product or service offerings that are more responsible and more ethically and socially acceptable. When the process itself is more inclusive, so will the final solutions be. This may range from making AI applications less racist to including more female-centered views when developing algorithms. If we fail to pay attention to diversity, many of the unconscious biases that have been uncovered may find their way into algorithm development. In the end, we may end up with algorithms that are severely biased and less trusted by potential user groups. Diversity in Data Science can endorse these relevant values and viewpoints already during the technology development process. Values like diversity and inclusiveness of the development teams and companies need to be front and center to ensure responsible and inclusive technology and innovation development.

Teresa Kubacka: Data science needs diversity because we live in diverse societies. Although we tend to think that “data is the new oil” and “data speaks for itself”, data is not part of the natural world like oil or the force of gravity; it is people who actively create data. People decide which data is important enough to collect and what to leave out. People decide which research question is worth the effort and which projects to invest in. This is why, if our goal is to create data-driven products that make sense for all members of society, we have to include a diverse representation of society in the process of creating those products and defining what is important. We have many examples of data projects that backfired spectacularly because they were designed and developed by a homogeneous group of people (for example a health monitoring app that doesn’t have a functionality to track menstruation). Luckily, we also have many examples where inclusive data science projects led to more empowerment or drove positive change. This holds for all kinds of diversity, not only gender diversity. Last but not least – can we afford to lose talented data scientists only because they don’t have “the right appearance (gender, skin color, etc.) for the job”?

What’s keeping women out of Data Science?

Christian Hopp: To be honest, I do not fully know, but I wish to understand. We have done prior research in STEM fields, focussing on female academic careers. We found that gender stereotyping attributes lower field-specific ability to women. In sum, women aspiring for an academic STEM career with leadership responsibility are confronted with “double” incongruity: First, they are experts in domains that are clearly male-dominated, subjecting them to severe biases stemming from the perception of their abilities. Second, even an aspiration for leadership was still seen as atypical for women. It could be that careers in data science are fraught with problems because women have to fulfill expectations in very male-dominated environments.

Also, interactions with colleagues and superiors played a similarly important role in academic careers in STEM fields. Oftentimes role models are important to pursue a certain career path. Especially early career stages are sensitive periods in which influential imprints may be left. A lack of prominent role models might keep women out of data science as a career choice. But to answer this more fully, we would need more empirical evidence as to how individual aspects of gender imbalance interact and co-evolve with systemic ones. Formal and informal rules, proximate social structures, organizational culture, professional networks, couple perspectives, prevailing stereotypes, and individual motives may all interact here.  

Teresa Kubacka: In my experience, it’s not the lack of interest. I meet plenty of women who are fascinated by data science and are highly competent to become good data scientists. Many obstacles that they face are the same for women in tech in general. Here I’ll focus on the ones most characteristic for data science. 

One group of obstacles is related to a stereotypical perception of who can be a good data scientist: it is a person with a formal degree in an area historically predominantly given to men (computer science, mathematics, etc.), so women are statistically more likely to be perceived as not having the “right” qualifications. This happens also because a data scientist is often perceived to be a better software engineer and there is low awareness of a big variety of different flavors of data scientist roles among recruiters as well as applicants – some roles are more product-oriented, there are many teams where a data scientist needs to be a good communicator and structured, analytical thinker in the first place, some data science roles have a strong UX and product design component. 

Another group of obstacles is more of a mundane nature, but can, unfortunately, be a real deal-breaker. One big issue is a lack of data science roles with 60% workload and a relatively small market for freelancers, which in Swiss reality means that women who have to share a large portion of family responsibilities cannot easily work as part-time data professionals. It is also not easy to make a transition into data science gradually on the job, without investing time after work into getting a certification (like a CAS or a Bootcamp). Working women with family responsibilities are particularly impacted by this because they cannot afford the time. 

The third group of obstacles comes from within the existing machine learning/data science community. Some things that used to be perceived as normal in male-dominated communities are perceived as hostile by many women. For example, until not long ago one of the most important AI conferences used to be called “NIPS”, with a pre-conference event called “TITS”. Only after severe criticism was it hesitantly renamed to “NeurIPS”. As a woman wanting to enter the field, you start asking yourself: will I be taken seriously there if they picked an acronym for a conference after a female body part?

The last thing that I think is also relevant is that on one side, requirements for data scientists written by some recruiters are unrealistic, but on the other side, many women don’t believe that they can apply and do the job if they fulfill only part of the requirements and learn the rest on the job. This is why it’s important to build up their confidence – for example by creating networks of female professionals who can exchange experiences, by creating inclusive environments that allow for free experimentation and learning by doing, and by engaging in different mentoring programs both as mentor and mentee. 

How can you support and push diversity forward?

Christian Hopp: Generally, I think it is important to increase awareness through communication. Organizations need to put the benefits of diversity front and center, not only on the webpage and in other communications but in the hearts and minds of people working in data science. We need to wholeheartedly embrace the notion that diversity leads to better, more inclusive, and more innovative outcomes. Second, and this comes from a middle-aged, white, male professor from a non-academic family background trying to educate the next generation of data scientists, we need to activate and communicate through role models. Career paths in data science need to become more visible for women and for diverse or minority individuals. Third, we need more mentoring for diverse and minority individuals in companies but also in academic training.

Teresa Kubacka: I can give some examples based on the activities in our community. We organize meetups and workshops aimed to support women and gender minorities in data science and machine learning. Our community members can talk about their data science projects and learn new skills in a friendly environment. We try to give an opportunity to speak to people who have different kinds of data science roles and life situations to present a variety of inspiring role models. We put a lot of emphasis on events where the participants can engage in informal coaching and at least once a year we try to organize a career event. We also support other communities for women in tech, for example by participating in the conference “WeTechTogether”, where more than 20 communities took part, and where WiMLDS, Litix, and Databooster organized a geodata workshop together. 

However, we can only do so much, and there is still much more systemic change that needs to happen. There is no single solution, and every organization will have to find its own path. Some solutions come as an answer to the obstacles to diversity I described before. For example, the Swiss job market would need to open up for more part-time work in data science. Over the last few years, we have already seen an increase in 80% workload positions. So if you are a manager and have an open position for a data scientist, consider making it a 60-100% position or a job-sharing position. If you have an employee with strong analytical skills who is inspired by data science, think about whether there is a way for this person to learn some data science skills within their current role. As a general rule, we are all biased and use stereotypes, so it’s always good to check your bias and privilege, and question your instinctive hiring choices because they may act against inclusiveness and diversity. If you design a product or start a project, put effort into assembling a diverse team. If you can, encourage people from historically non-privileged groups to participate in high-profile projects and give them credit and recognition for their effort. Once an organization embraces diversity as its value, and not as an option for interested participants, many of these changes will follow naturally as a consequence of this choice.

The Art of Data Fusion

By Nicolas Lenz (Litix), Stefan Keller (OST) and Reik Leiterer (ExoLabs)

Geodata are used in various industries and academic fields and often have to meet specific requirements in order to be used, for example in terms of geometry, recording time point or semantics. But often different geodata sets have similar geometric properties but different semantics or are captured at different times – or vice versa. Accordingly, the added value arises when your data sources start to «talk to each other», connection points between the data are used or possible gaps can be filled. In this context, there is a multitude of technical terms, which are sometimes used differently depending on the subject area, sometimes are used synonymously and sometimes are used inappropriately in their terminology – so you will read about «append», «merge», «relate», «link», «connect», «join», «combine», or «fuse», just to mention a few.

In the last meeting of the Spatial Data Analytics Expert Group on the 4th of November, this topic was presented and discussed, and the challenges and potential of the concept were highlighted. This included a critical examination of the semantic classification as well as the presentation of various possible applications in research and industry. Our host was the UZH Space Hub at the University of Zurich, represented by Dr. Claudia Röösli.

Representation of individual tree characteristics based on multi-temporal airborne 3D-LiDAR data, in situ measurements, and multi-spectral satellite data. Fuses data – or not?

So, what is Data Fusion – with a strong focus on spatial data? For some, it means more a list of different data sets, with a narrative relating one data set to the next. For others, it means visualizing different data sources on the same graph to spot trends, dynamics, or relations. In the spatial domain, the basic concept of data fusion is often the extraction of the best-fit geometry data as well as the most suitable semantic data and acquisition times from existing datasets.
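
The spatial reading of data fusion described above can be illustrated with a toy example in Python. The two records, their field names and the selection rules (most accurate geometry, most recently captured semantics) are assumptions made for illustration, not a method presented at the meeting.

```python
from datetime import date

# Two hypothetical datasets describing the same building.
survey_record = {
    "id": "building-1",
    "geometry": [(0.0, 0.0), (10.0, 0.0), (10.0, 8.0), (0.0, 8.0)],
    "geometry_accuracy_m": 0.1,  # precise outline
    "use": "residential",
    "captured": date(2015, 6, 1),
}
crowd_record = {
    "id": "building-1",
    "geometry": [(0.5, 0.2), (9.0, 0.1), (9.4, 7.5), (0.3, 7.9)],
    "geometry_accuracy_m": 2.0,  # rough outline
    "use": "mixed",
    "captured": date(2021, 3, 15),
}

def fuse(records):
    """Combine the best-fit geometry with the most recent semantics."""
    best_geometry = min(records, key=lambda r: r["geometry_accuracy_m"])
    latest = max(records, key=lambda r: r["captured"])
    return {
        "id": best_geometry["id"],
        "geometry": best_geometry["geometry"],
        "geometry_accuracy_m": best_geometry["geometry_accuracy_m"],
        "use": latest["use"],
        "use_captured": latest["captured"],
    }

fused = fuse([survey_record, crowd_record])
print(fused["use"], fused["geometry_accuracy_m"])
```

The fused record keeps the precise outline from the older dataset and the newer land-use attribute from the rougher one, together with the date that attribute refers to.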

The keynote was given by Dr. André Bruggmann, Co-CEO, Data Scientist and Geospatial Solutions Expert at Crosswind GmbH. Under the motto “Unlock the Where.”, he presented how data fusion techniques help customers gain new insights, from (spatial) visualizations to web applications, and facilitate strategic business decisions (e.g., selection of optimal point of sale locations). In addition, he presented a project where data fusion techniques are applied to make detailed and future-oriented statements about the uptake of e-mobility and identify relevant trends for the automotive industry.

Dr. André Bruggmann from Crosswind – “Unlock the Where”

These inputs led to an exciting discussion between the experts present – not only on the technical implementations presented, but also regarding the potential for optimisation and possible future cooperation. This is exactly how the initiators of the event had envisioned it – an open and inspiring exchange in line with the basic idea of open innovation.

Are you also interested in spatial data and its applications? Then come to the next expert group meeting on 15th of December on the topic of GIS and Health, hosted by Dr. Joachim Steinwendner from FFHS.

Smart Service Innovation for Adapting to the Pandemic Situation – Successful Smart Services Summit 2021

By Jürg Meierhofer

On October 22, the expert group Smart Services welcomed worldwide top experts to the fourth Smart Services Summit. The focus was on how Smart Services allow firms to adapt to the COVID-19 pandemic. Examples of remote and collaborative working have created new forms of co-delivery where customers are integrated into the service processes. Such a change requires a mindset change for more traditional firms as the service model migrates from ‘do it for you’ to ‘do it yourself’ or some mix of ‘do it together’. Considering service science, the switch makes perfect sense as it means that the full set of resources within the ecosystem is now being used rather than only a part. Services can be delivered faster and at lower costs with the support of new technologies and when working with the customer in a co-delivery mode. The changes are leading to new value propositions and business models today and will lead to an evolution in Smart Services in the future. The changes themselves must be understood, and we may need to consider new or different implementation and delivery models for Smart Services. These new working approaches may also require us to re-evaluate both training and education.

Across the papers and presentations, it became apparent that digital service innovation has substantially changed and accelerated since the start of the pandemic. Customer needs and service processes have undergone dramatic disruption, which is still ongoing. A common thread throughout all the papers was the concept of ecosystem thinking, which was discussed comprehensively and from a wide range of perspectives. In line with the concept of the Service-Dominant Logic, the needs of the different actors in the ecosystem need to be identified and integrated into the design of the services and the integration of the various resources in the ecosystem. The ecosystem perspective integrates not only the different human actors but also technological and digital resources.

Innovation through intensive collaboration makes it possible to switch between different perspectives and innovation approaches. This results in seamless value propositions and solutions for the beneficiary actors, which is a necessary prerequisite for economic value creation. Well-designed service experiences based on a consistently customer-centric view and approach are thus the basis of value creation.

This transition to digital service innovation in ecosystems requires not only fundamental changes of the technological platforms. In particular, collaboration across actors, organizations, and industry requires a new level of trust, culture, skills, marketing approaches and innovation frameworks.

Many thanks to all those who spoke at, and attended, the Smart Service Summit. A big thanks to IBM, the data innovation alliance, ZHAW Zurich University of Applied Sciences and Lucerne University of Applied Sciences and Arts for supporting the event.

data innovation alliance at the AI+X Summit

The ETH AI Center celebrated its first birthday on October 15, 2021, at the AI+X Summit and the data innovation alliance was there to congratulate and to join the inspiring crowd. The day started with workshops.

David Sturzenegger and Stefan Deml from Decentriq organized one of the workshops on “Privacy-preserving analytics and ML” in the name of the alliance.

It was our first in-person workshop again, and such a great experience for us. We gave an overview of various privacy-enhancing technologies (PETs) to a very engaged and diverse audience of about 30 people. We had in-depth discussions about the use cases that PETs could unlock, and also presented Decentriq’s data clean rooms and our use of confidential computing. Our product certainly generated a lot of follow-up interest, especially from those who wanted to reach out for a demo of the platform. We were also joined by a guest speaker from Hewlett Packard who spoke about “Swarm Learning”.

David Sturzenegger, Stefan Deml

Melanie Geiger from the data innovation alliance office attended the workshop about AI + Industry & Manufacturing led by Olga Fink from ETH. The overall goal of the workshop was to identify the next research topics. Small groups with representatives from manufacturing companies mixed with researchers discussed the challenges and opportunities of predictive maintenance, quality control, optimization and computer vision. We identified research topics such as more generalizable predictive maintenance methods that work for multiple machines or even multiple manufacturing companies. But we also realized that some challenges lie more on the operational or applied research side, such as integrating the method into the whole manufacturing process and closing the feedback loop.

In the evening the exhibition and the program on the main stage attracted 1000 participants. We had many interesting discussions at our booth with a wonderful mix of students, entrepreneurs, researchers, and people from the industry. Of course, we also saw many familiar faces and due to the 3G policy, we got back some “normality”.