  
Summing up the stations now "only" takes 4*N_a*N_b*bw floating-point operations per second (factor 4 is for two polarisations and complex beams). This is typically 90 TFLOPS, but these can be combined with the phase corrections using fused multiply/accumulate operations.
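
A quick sanity check of this number, as a sketch with assumed MeerKAT-like parameters (N_a=64 antennas and bw=856 MHz of processed bandwidth are assumptions; only N_b=400 is taken from this page):

<code python>
# Sketch: coherent-sum cost 4 * N_a * N_b * bw, with assumed MeerKAT-like values.
N_a = 64       # number of antennas (assumed, not fixed on this page)
N_b = 400      # number of beams (as above)
bw  = 856e6    # processed bandwidth in Hz = complex samples per second (assumed)

coherent_sum_flops = 4 * N_a * N_b * bw   # factor 4: two polarisations, complex beams
print(f"coherent sum: {coherent_sum_flops / 1e12:.0f} TFLOPS")   # ~88 TFLOPS, i.e. ~90
</code>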
  
==== Incoherent sums ====
==== Averaging ====
  
The beams can be averaged in time and/or frequency, depending on the setup. In principle this does not require too many additional operations, because the additions can be combined with the former ones. Otherwise we basically have to add all data again, which means 4*N_a*N_b*bw = 90 TFLOPS more operations (factor 4 for two polarisations and complex beams).
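
A minimal sketch of the two cases, reusing the same assumed N_a=64 and bw=856 MHz as above (whether the averaging additions can be fused with the earlier ones depends on the setup):

<code python>
# Sketch: extra cost of averaging the beams, with assumed MeerKAT-like values.
N_a, N_b, bw = 64, 400, 856e6
fused_with_beamforming = False   # True if the additions can be combined with the former ones

extra_averaging_flops = 0 if fused_with_beamforming else 4 * N_a * N_b * bw
print(f"extra averaging: {extra_averaging_flops / 1e12:.0f} TFLOPS")   # 0 if fused, else ~90
</code>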
  
  
==== Beam-forming total ====
  
For the phase correction and coherent and incoherent sums, and assuming that fused operations can be used, we will need a total of 270 TFLOPS. With additional averaging (if it cannot be combined with previous operations in an efficient way) we need at most 360 TFLOPS.
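
Put together (a sketch; the 270 TFLOPS figure is taken as given from the steps above, the averaging term uses the assumed N_a and bw):

<code python>
# Sketch: total beam-forming cost from the figures quoted above.
fused_total     = 270e12                  # phase correction + coherent + incoherent sums with fused ops
extra_averaging = 4 * 64 * 400 * 856e6    # worst case: separate averaging pass (assumed N_a, bw)
print(f"total: {fused_total / 1e12:.0f} to "
      f"{(fused_total + extra_averaging) / 1e12:.0f} TFLOPS")   # 270 up to ~360
</code>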
  
  
With L=1 (8) km baselines and D=13.5 m dishes, we need of the order of (L/D)^2 = 5500 (350 000) beams to cover the primary beam. The N_b=400 beams will thus not cover the entire area.
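
The same estimate in a few lines, with D and L as given here:

<code python>
# Sketch: number of beams needed to tile the primary beam, of order (L/D)^2.
D = 13.5                      # dish diameter in metres
for L in (1e3, 8e3):          # maximum baseline in metres
    n_beams = (L / D) ** 2
    print(f"L = {L / 1e3:.0f} km: ~{n_beams:,.0f} beams")   # ~5500 and ~350 000
</code>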
  
We can form additional beams from visibilities. For this we assume that we have only one polarisation product (typically Stokes I) and a decimation factor in time/frequency of N_d. Per (decimated) sample, each beam requires the addition of N_a^2/2 visibilities after applying phase factors (4 MUL and 2 ADD). This means 3*N_a^2*N_b*bw/N_d operations. With a decimation of N_d=10 and N_b=400 beams, this means 430 TFLOPS, not much more than the direct beam-forming. But now we can trade resolution in time/frequency for number of beams. Going from 50 microsec time resolution to 500 microsec boosts the number of beams to 4000. At higher frequencies (e.g. for GC searches) we can average more in frequency without losing any science.

This sounds realistic, provided we can search all these beams. But they do not have to be searched in real time. Still, they may have to go into the switch again, which may limit us.

Also we must not forget that decimation in time and/or frequency reduces our field of view because of bandwidth and time-averaging smearing!
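
As a sketch of the visibility-based beam-forming cost and of the time-resolution trade-off, again with assumed N_a=64 and bw=856 MHz:

<code python>
# Sketch: cost of forming beams from visibilities, 3 * N_a**2 * N_b * bw / N_d operations
# (N_a**2 / 2 visibilities per beam and decimated sample, 4 MUL + 2 ADD each).
N_a, N_b, bw = 64, 400, 856e6   # assumed MeerKAT-like values
N_d = 10                        # decimation factor in time/frequency

vis_beam_flops = 3 * N_a**2 * N_b * bw / N_d
print(f"{N_b} beams: {vis_beam_flops / 1e12:.0f} TFLOPS")   # ~420-430 TFLOPS

# Trading time resolution for beams: 50 us -> 500 us is another factor 10 in N_d,
# so 10x more beams fit into the same operation count.
N_d_coarse = 100
N_b_coarse = N_b * N_d_coarse // N_d         # 4000 beams
print(f"{N_b_coarse} beams: {3 * N_a**2 * N_b_coarse * bw / N_d_coarse / 1e12:.0f} TFLOPS")
</code>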
  
An FFT can be used for this "imaging" step, but this makes everything much more complicated, because visibilities would have to be gridded, which is probably not very efficient on GPUs.
  
  
 