Recent Posts

We have updated the documentation to include a section. As we wrote the code, it makes sense to us. However, we know that for someone not sure of what to do, the documentation is possibly incomplete. Let us know.
Programming Bootcamp / Solution to Exercise 5, Day2 - Reading from Large Binary Files
« Last post by faisal on January 10, 2021, 07:58:15 PM »
I couldn't find the solution to exercise 5, day 2 (reading a large binary file) in the official NHERI GitHub folder:

Will the correct solution of file3.c be provided so that I can verify my answer?

Also, file3.c gives no error when I compile it with gcc 10.2.0 on msys2, but it gives an error with icc.

Here is my solution, which works on icc without error. Is it correct?

Code: [Select]
// program to read values from a file, each line a csv list of an int and two doubles
// written: fmk
// exercise solved by: fr

#include <stdio.h>
#include <stdlib.h>

int getcountlines(char *filename);

int main(int argc, char **argv) {

  if (argc != 2) {
    fprintf(stdout, "ERROR correct usage appName inputFile\n");
    return -1;
  }

  // count the total number of lines up front, in case
  // the file has more lines than maxVectorSize (100)
  int countlines = getcountlines(argv[1]);
  if (countlines <= 0) {
    printf("the text file is empty or could not be opened\n aborting program ...\n");
    return -1;
  }
  printf("total number of lines are: %i\n\n", countlines);

  FILE *filePtr = fopen(argv[1], "r");
  if (filePtr == NULL) {
    printf("could not open file %s\n", argv[1]);
    return -1;
  }

  int i = 0;
  float float1, float2;
  int maxVectorSize = 100;
  double *vector1 = (double *)malloc(maxVectorSize*sizeof(double));
  double *vector2 = (double *)malloc(maxVectorSize*sizeof(double));
  if (vector1 == NULL || vector2 == NULL) {
    printf("out of memory\n");
    return -1;
  }

  int vectorSize = 0; // number of values stored so far

  // stop at the first line that does not yield all three values
  while (fscanf(filePtr, "%d, %f, %f\n", &i, &float1, &float2) == 3) {
    vector1[vectorSize] = float1;
    vector2[vectorSize] = float2;
    printf("%d, %f, %f\n", i, vector1[vectorSize], vector2[vectorSize]);
    vectorSize++;

    if (vectorSize == maxVectorSize) {
      // vectors are full: grow to the line count, or double as a fallback
      maxVectorSize = (countlines > maxVectorSize) ? countlines : 2*maxVectorSize;
      vector1 = (double *) realloc(vector1, maxVectorSize*sizeof(double)); // allocate again
      vector2 = (double *) realloc(vector2, maxVectorSize*sizeof(double)); // allocate again
      if (vector1 == NULL || vector2 == NULL) {
        printf("out of memory\n");
        return -1;
      }
    }
  }

  fclose(filePtr);
  free(vector1);
  free(vector2);
  return 0;
}

// function to count lines
int getcountlines(char *filename) {
  // count the number of lines in the file called filename
  FILE *fp = fopen(filename, "r");
  int ch = 0;
  int prev = '\n'; // last character read
  int lines = 0;

  if (fp == NULL) {
    return -1;
  }

  // count newline characters
  while ((ch = fgetc(fp)) != EOF) {
    if (ch == '\n') {
      lines++;
    }
    prev = ch;
  }
  // a final line without a trailing newline still counts as a line
  if (prev != '\n') {
    lines++;
  }

  fclose(fp);
  return lines;
}

Regional Hazard Simulation (RDT, rWhale) / Re: rWHALE - output DL data units?
« Last post by kubee77 on January 10, 2021, 03:09:56 AM »
I am running rWHALE on Windows, and when I use conan it tells me this: ERROR: Missing prebuilt package for 'jansson/2.11@simcenter/stable'. I am following the instructions under "Building the source code on Windows", and I am constantly running into the same errors.
I noticed that there is a recent update saying: "Option to allow user to include their own FEM engine". May I ask how this works, and are there any documents for it? Many thanks!
Programming Bootcamp / Re: Constructor and Destructor types
« Last post by fmk on January 08, 2021, 04:48:52 PM »
A null constructor has 0 arguments; others have a number of arguments. A single class can have multiple constructors that take different types and numbers of arguments.
you can .. you only need virtual destructors if you are envisioning a subclass
Programming Bootcamp / Re: Example for lib from cmak file
« Last post by fmk on January 08, 2021, 04:47:00 PM »
I did an example in the shapes CMakeLists file.
Programming Bootcamp / Re: Some question regarding parallel programming concepts
« Last post by fmk on January 08, 2021, 04:46:26 PM »
1. I was not referring to the book by Knuth (though that is where the term may have originated) but to a widely articulated philosophy about how to program.
2. A bus is something that communicates data between computer hardware components (wires, optical fiber, ...).
3. A sequential program is written to use only one core; a parallel program is written to use many cores.
4. Parts of a computation that are identified as being able to run in parallel.
5. OpenMP provides a way of using multiple cores at the same time, with threads running in the same memory space of a single process. In MPI there are no threads; the processes have separate address spaces and communication is performed between them.
6. OpenMP is between cores on a single CPU; MPI can be between cores on a single CPU or between cores on different CPUs.
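To illustrate points 3 and 5, here is a minimal sketch of an OpenMP loop in which all threads work on the same array in one address space. The function name scale_in_place is hypothetical and not from the lecture code.

```c
// scale_in_place is a hypothetical helper used only for illustration.
// With OpenMP enabled (e.g. gcc -fopenmp) the loop iterations are split
// across threads that all see the same array a; without it the pragma is
// ignored and the loop runs sequentially on one core.
void scale_in_place(double *a, int n, double factor) {
  #pragma omp parallel for
  for (int i = 0; i < n; i++) {
    a[i] *= factor; // each iteration touches a different element, so no race
  }
}
```

The same source builds as a sequential program when the OpenMP flag is left off, which is one reason the pragma approach is convenient for shared-memory code.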
Programming Bootcamp / Re: Regarding recent MPI Lecture video (some quesitons part 1/2)
« Last post by fmk on January 08, 2021, 04:31:34 PM »
1. sum1 .. false sharing again in 2 .. I would assume 3 will be close to 1, as it lets the compiler put in what it thinks best
2. no
3. if you could, but numT is not known at that statement location for a compile-time setup
4. not sure which file you are referring to
Programming Bootcamp / Re: Regarding recent MPI Lecture video (some quesitons part 2/2)
« Last post by fmk on January 08, 2021, 04:15:59 PM »
pad yes .. the other options do not specifically deal with it, as they control access to a single variable; however, they do overcome the problem

when p0 writes to memory that is part of a cache line, that line has to be sent to the shared cache, and the other cores do not get access to their copy of it until the updated line has been put out on the shared cache and can be read back in
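A minimal sketch of the padding idea, assuming a 64-byte cache line (8 doubles) and at most 64 threads. Both constants and the name dot_padded are assumptions for illustration, not values from the slides. Each thread writes only to column 0 of its own row, so no two partial sums sit on the same cache line.

```c
#ifdef _OPENMP
#include <omp.h>
#endif

#define MAX_THREADS 64 // assumption: upper bound on threads used in this sketch
#define PAD 8          // assumption: 8 doubles = 64 bytes, a common cache-line size

// dot_padded is a hypothetical name, not taken from the lecture code
double dot_padded(const double *x, const double *y, int n) {
  // one row per thread; only column 0 is used, the rest is padding
  double sum[MAX_THREADS][PAD] = {{0.0}};
  int numT = 1;

  #pragma omp parallel
  {
    int tid = 0;
    int nthreads = 1;
#ifdef _OPENMP
    tid = omp_get_thread_num();
    nthreads = omp_get_num_threads();
#endif
    // never use more rows than the sum array provides
    int used = (nthreads < MAX_THREADS) ? nthreads : MAX_THREADS;
    if (tid == 0) {
      numT = used;
    }
    if (tid < used) {
      for (int i = tid; i < n; i += used) {
        sum[tid][0] += x[i] * y[i];
      }
    }
  }

  // combine the per-thread partial sums sequentially
  double dot = 0.0;
  for (int t = 0; t < numT; t++) {
    dot += sum[t][0];
  }
  return dot;
}
```

Without the PAD dimension, the partial sums of neighbouring threads would be adjacent doubles and would very likely share a cache line.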

sync solves the problem because the programmer explicitly tells the compiler that access to this memory must be controlled. in the original the compiler is not told, so it can do nothing about what happens, which is why what needs to happen ends up killing the performance
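A minimal sketch of the synchronization approach, assuming a dot product like the one in the question. The name dot_critical is hypothetical. Each thread accumulates into its own private variable, and only the final update of the shared result is protected.

```c
// dot_critical is a hypothetical name, not taken from the lecture code
double dot_critical(const double *x, const double *y, int n) {
  double dot = 0.0; // shared by all threads

  #pragma omp parallel
  {
    double sum = 0.0; // private to each thread, so no cache line is shared

    #pragma omp for
    for (int i = 0; i < n; i++) {
      sum += x[i] * y[i];
    }

    // only one thread at a time may update the shared result
    #pragma omp critical
    dot += sum;
  }
  return dot;
}
```

Because the critical section runs once per thread rather than once per iteration, the cost of the synchronization stays small.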

reduction is going to be implemented just like the sync; it is just a convenient feature that OpenMP provides the programmer with for writing an operation that is very common.
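A minimal sketch of the same dot product written with a reduction clause. The name dot_reduction is hypothetical. OpenMP gives each thread a private copy of dot and combines the copies when the loop ends.

```c
// dot_reduction is a hypothetical name, not taken from the lecture code
double dot_reduction(const double *x, const double *y, int n) {
  double dot = 0.0;

  // each thread sums into its own private copy of dot;
  // the copies are added together when the loop finishes
  #pragma omp parallel for reduction(+:dot)
  for (int i = 0; i < n; i++) {
    dot += x[i] * y[i];
  }
  return dot;
}
```

The private accumulator and the combining step no longer have to be written by hand.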

you can do all this in C++; what is available in C is available in C++

have a look at the following:
Programming Bootcamp / Regarding recent MPI Lecture video (some quesitons part 2/2)
« Last post by faisal on January 07, 2021, 03:54:11 PM »
I can't grasp the idea of false sharing.

Q1. Are pad, sync, and reduction the available choices to solve false sharing/sequential consistency?

Q2. What difference is made by pad (next slide)?
Followup: Here it appears that all calculations are happening in pad0? Is it right?

Q3. What is synchronization doing (next slide)? So does sync solve the problem of pad0? The second for loop
Code: [Select]
dot += sum
is gone.

Q4. So what does the reduction do? Does reduction do the job of sync and pad? So is reduction alone
enough to stop false sharing/sequential consistency? And when should the reduction types be used?

Q5. Finally, does Intel MKL have parallel capabilities, and can't we do multi-threading and multi-processing in C++?