References

1

Ameer M.S. Abdelhadi and Guy G.F. Lemieux. Modular multi-ported SRAM-based memories.
In Proceedings of the International Symposium on Field Programmable Gate Arrays (FPGA),
pages 35–44. ACM, 2014. ISBN 978-1-4503-2671-1. doi: 10.1145/2554688.2554773. URL
http://doi.acm.org/10.1145/2554688.2554773.

2

SystemC. Accellera, 2.3.2 edition, October 2017. URL http://www.accellera.org/
downloads/standards/systemc.

3

Hassan M. Ahmed, Jean-Marc Delosme, and Martin Morf. Highly concurrent computing
structures for matrix arithmetic and signal processing. IEEE Computer, 15(1):65–82, 1982.

4

Raymond J. Andraka. Building a high performance bit-serial processor in an FPGA. In
Proceedings of Design SuperCon, volume 96, pages 1–5, 1996.

5

Oriol Arcas-Abella et al. An empirical evaluation of high-level synthesis languages and tools
for database acceleration. In Proceedings of the International Field Programmable Logic and
Applications Conference (FPL), pages 1–8. IEEE, 2014.

6

AMBA AXI and ACE Protocol Specification. ARM Limited, v1.0 edition, 2013. URL http://
infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ihi0022e/index.html. ARM
IHI 0022E.

7

Bryce E. Bayer. Color imaging array, July 1976. US 3971065.

8

Samuel Bayliss and George A. Constantinides. Optimizing SDRAM bandwidth for custom
FPGA loop accelerators. In Proceedings of the International Symposium on Field
Programmable Gate Arrays (FPGA), pages 195–204. ACM, February 2012.

9

Marcus Bednara et al. Tradeoff analysis and architecture design of a hybrid hardware/software
sorter. In Proceedings of the International Conference on Application-specific Systems,
Architectures and Processors (ASAP), pages 299–308. IEEE, 2000.

10

Vaughn Betz and Jonathan Rose. VPR: A new packing, placement and routing tool for
FPGA research. In Proceedings of the International Field Programmable Logic and Applications
Conference (FPL), pages 213–222. Springer, 1997.

11

Guy E. Blelloch. Prefix sums and their applications. Technical Report CMU-CS-90-190, School
of Computer Science, Carnegie Mellon University, November 1990.

12

Stephen Brown and Jonathan Rose. FPGA and CPLD architectures: A tutorial. IEEE Design
and Test of Computers, 13(2):42–57, 1996.

13

Andrew Canis, Jongsok Choi, Mark Aldham, Victor Zhang, Ahmed Kammoona, Jason H
Anderson, Stephen Brown, and Tomasz Czajkowski. LegUp: high-level synthesis for FPGA-
based processor/accelerator systems. In Proceedings of the International Symposium on Field
Programmable Gate Arrays (FPGA), pages 33–36. ACM, 2011.

14

Jianwen Chen, Jason Cong, Ming Yan, and Yi Zou. FPGA-accelerated 3D reconstruction using
compressive sensing. In Proceedings of the International Symposium on Field Programmable
Gate Arrays (FPGA), pages 163–166. ACM, 2012.

15

Jason Cong, Bin Liu, Stephen Neuendorffer, Juanjo Noguera, Kees Vissers, and Zhiru Zhang.
High-level synthesis for FPGAs: From prototyping to deployment. IEEE Transactions on
Computer-aided Design of Integrated Circuits and Systems (TCAD), 30(4):473–491, 2011.

16

James W. Cooley and John W. Tukey. An algorithm for the machine calculation of complex
Fourier series. Mathematics of Computation, 19(90):297–301, 1965. ISSN 00255718. URL
http://www.jstor.org/stable/2003354.

17

Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein. Introduction to
Algorithms, Third Edition. MIT Press, 3rd edition, 2009. ISBN 0262033844, 9780262033848.

18

Philippe Coussy and Adam Morawiec. High-level synthesis, volume 1. Springer, 2010.

19

Steve Dai, Ritchie Zhao, Gai Liu, Shreesha Srinath, Udit Gupta, Christopher Batten, and
Zhiru Zhang. Dynamic hazard resolution for pipelining irregular loops in high-level synthesis.
In Proceedings of the International Symposium on Field Programmable Gate Arrays (FPGA),
pages 189–194, 2017. ISBN 978-1-4503-4354-1. doi: 10.1145/3020078.3021754. URL http:
//doi.acm.org/10.1145/3020078.3021754.

20

Jeffrey Dean and Sanjay Ghemawat. MapReduce: Simplified data processing on large clusters.
Communications of the ACM, 51(1):107–113, January 2008. ISSN 0001-0782. doi: 10.1145/
1327452.1327492. URL http://doi.acm.org/10.1145/1327452.1327492.

21

Alvin M. Despain. Fourier transform computers using CORDIC iterations. IEEE Transactions
on Computers, 100(10):993–1001, 1974.

22

J. Detrey and F. de Dinechin. Floating-point trigonometric functions for FPGAs. In Proceedings
of the International Field Programmable Logic and Applications Conference (FPL), pages
29–34, August 2007. doi: 10.1109/FPL.2007.4380621.

23

L. Peter Deutsch. DEFLATE compressed data format specification version 1.3. 1996.

24

Jean Duprat and Jean-Michel Muller. The CORDIC algorithm: new results for fast VLSI
implementation. IEEE Transactions on Computers, 42(2):168–178, 1993.

25

Brian P. Flannery, William H. Press, Saul A. Teukolsky, and William Vetterling. Numerical
recipes in C. Press Syndicate of the University of Cambridge, New York, 1992.

26

Daniel D. Gajski, Nikil D. Dutt, Allen C.H. Wu, and Steve Y.L. Lin. High-Level Synthesis:
Introduction to Chip and System Design. Springer Science & Business Media, 2012.

27

W. Morven Gentleman and Gordon Sande. Fast Fourier transforms: for fun and profit. In
Proceedings of the November 7-10, 1966, fall joint computer conference, pages 563–578. ACM,
1966.

28

Nivia George, HyoukJoong Lee, David Novo, Tiark Rompf, Kevin J. Brown, Arvind K. Sujeeth,
Martin Odersky, Kunle Olukotun, and Paolo Ienne. Hardware system synthesis from
domain-specific languages. In Proceedings of the International Field Programmable Logic and
Applications Conference (FPL), pages 1–8. IEEE, 2014.

29

Sumit Gupta, Rajesh Gupta, Nikil Dutt, and Alexandru Nicolau. SPARK: A Parallelizing
Approach to the High-level Synthesis of Digital Circuits. Kluwer, 2004. ISBN 1-4020-7837-4.

30

Scott Hauck and Andre DeHon. Reconfigurable computing: the theory and practice of
FPGA-based computation, volume 1. Morgan Kaufmann, 2010.

31

James Hegarty, Ross Daly, Zachary DeVito, Jonathan Ragan-Kelley, Mark Horowitz, and
Pat Hanrahan. Rigel: Flexible multi-rate image processing hardware. ACM Trans. Graph.,
35(4):85:1–85:11, July 2016. ISSN 0730-0301. doi: 10.1145/2897824.2925892. URL http:
//doi.acm.org/10.1145/2897824.2925892.

32

M. Heideman, D. Johnson, and C. Burrus. Gauss and the history of the fast Fourier transform.
ASSP Magazine, IEEE, 1(4):14–21, October 1984. ISSN 0740-7467. doi: 10.1109/MASSP.
1984.1162257.

33

David A. Huffman. A method for the construction of minimum-redundancy codes. Proceedings
of the IRE, 40(9):1098–1101, 1952.

34

Ryan Kastner, Anup Hosangadi, and Farzan Fallah. Arithmetic optimization techniques for
hardware and software design. Cambridge University Press, 2010.

35

David Knapp. Behavioral Synthesis: Digital System Design using the Synopsys Behavioral
Compiler. Prentice-Hall, 1996. ISBN 0-13-569252-0.

36

Donald Ervin Knuth. The art of computer programming: sorting and searching, volume 3.
Pearson Education, 1998.

37

Charles Eric Laforest, Zimo Li, Tristan O'Rourke, Ming G. Liu, and J. Gregory Steffan.
Composing multi-ported memories on FPGAs. ACM Transactions on Reconfigurable Technology
and Systems (TRETS), 7(3):16:1–16:23, September 2014. ISSN 1936-7406. doi:
10.1145/2629629. URL http://doi.acm.org/10.1145/2629629.

38

Glen G. Langdon Jr, Joan L. Mitchell, William B. Pennebaker, and Jorma J. Rissanen. Arithmetic
coding encoder and decoder system, February 27 1990. US Patent 4,905,297.

39

Dajung Lee, Janarbek Matai, Brad Weals, and Ryan Kastner. High throughput channel
tracking for JTRS wireless channel emulation. In Proceedings of the International Field
Programmable Logic and Applications Conference (FPL). IEEE, 2014.

40

Edward A. Lee and David G. Messerschmitt. Pipeline interleaved programmable DSPs:
Architecture. IEEE Transactions on Acoustics, Speech, and Signal Processing (TASSP), 35(9):
1320–1333, September 1987.

41

Edward A. Lee and Pravin Varaiya. Structure and Interpretation of Signals and Systems,
Second Edition. 2011. ISBN 0578077191. URL LeeVaraiya.org.

42

Edward Ashford Lee. Plato and the Nerd: The Creative Partnership of Humans and Technology.
MIT Press, 2017. ISBN 978-0262036481.

43

C. Leiserson, F. Rose, and J. Saxe. Optimizing synchronous circuitry by retiming. In Third
Caltech Conference on VLSI, 1983.

44

G. Liu, M. Tan, S. Dai, R. Zhao, and Z. Zhang. Architecture and synthesis for area-efficient
pipelining of irregular loop nests. IEEE Transactions on Computer-aided Design of Integrated
Circuits and Systems (TCAD), 36(11):1817–1830, November 2017. ISSN 0278-0070. doi:
10.1109/TCAD.2017.2664067.

45

Rui Marcelino et al. Sorting units for FPGA-based embedded systems. In Distributed Embedded
Systems: Design, Middleware and Resources, pages 11–22. Springer, 2008.

46

Janarbek Matai, Pingfan Meng, Lingjuan Wu, Brad T Weals, and Ryan Kastner. Designing a
hardware in the loop wireless digital channel emulator for software defined radio. In Proceedings
of the International Conference on Field-Programmable Technology (FPT). IEEE, 2012.

47

Janarbek Matai, Joo-Young Kim, and Ryan Kastner. Energy efficient canonical Huffman
encoding. In Proceedings of the International Conference on Application-specific Systems,
Architectures and Processors (ASAP). IEEE, 2014.

48

Janarbek Matai, Dustin Richmond, Dajung Lee, and Ryan Kastner. Enabling FPGAs for the
masses. arXiv preprint arXiv:1408.5870, 2014.

49

Janarbek Matai, Dustin Richmond, Dajung Lee, Zac Blair, Qiongzhi Wu, Amin Abazari, and
Ryan Kastner. Resolve: Generation of high-performance sorting architectures from high-level
synthesis. In Proceedings of the International Symposium on Field Programmable Gate Arrays
(FPGA), pages 195–204. ACM, 2016. ISBN 978-1-4503-3856-1. doi: 10.1145/2847263.2847268.
URL http://doi.acm.org/10.1145/2847263.2847268.

50

Carver Mead and Lynn Conway. Introduction to VLSI systems, volume 1080. Addison-Wesley
Reading, MA, 1980.

51

Giovanni De Micheli. Synthesis and optimization of digital circuits. McGraw-Hill Higher
Education, 1994.

52

Shahnam Mirzaei, Anup Hosangadi, and Ryan Kastner. FPGA implementation of high speed
FIR filters using add and shift method. In Computer Design, 2006. ICCD 2006. International
Conference on, pages 308–313. IEEE, 2007.

53

MISRA. Guidelines for the Use of the C Language in Critical Systems. March 2013. ISBN
978-1-906400-10-1. URL https://www.misra.org.uk.

54

Rene Mueller et al. Sorting networks on FPGAs. The VLDB Journal - The International
Journal on Very Large Data Bases, 21(1):1–23, 2012.

55

Jorge Ortiz et al. A streaming high-throughput linear sorter system with contention buffering.
International Journal of Reconfigurable Computing, 2011.

56

Marios C. Papaefthymiou. Understanding retiming through maximum average-weight cycles.
In SPAA '91: Proceedings of the third annual ACM symposium on Parallel algorithms and
architectures, pages 338–348, 1991.

57

William B. Pennebaker. JPEG: Still image data compression standard. Springer, 1992.

58

Robert Sedgewick. Algorithms in C. Addison-Wesley, 2001. ISBN 978-0201756081.

59

Bhaskar Sherigar and Valmiki K. Ramanujan. Huffman decoder used for decoding both advanced
audio coding (AAC) and MP3 audio, June 29 2004. US Patent App. 10/880,695.

60

F. Winterstein, S. Bayliss, and G. A. Constantinides. High-level synthesis of dynamic data
structures: A case study using Vivado HLS. In Proceedings of the International Conference
on Field-Programmable Technology (FPT), pages 362–365, December 2013. doi: 10.1109/
FPT.2013.6718388.

61

Ian H. Witten, Radford M. Neal, and John G. Cleary. Arithmetic coding for data compression.
Communications of the ACM, 30(6):520–540, 1987.

62

UltraScale Architecture Configurable Logic Block (UG574). Xilinx, v1.5 edition,
February 2017. URL https://www.xilinx.com/support/documentation/user_guides/
ug574-ultrascale-clb.pdf.

63

Vivado Design Suite User Guide: High-Level Synthesis (UG902). Xilinx, v2017.1 edition, April
2017. URL https://www.xilinx.com/support/documentation/sw_manuals/xilinx2017_
1/ug902-vivado-high-level-synthesis.pdf.

64

UltraScale Architecture Configuration (UG570). Xilinx, v1.7 edition, March
2017. URL https://www.xilinx.com/support/documentation/user_guides/
ug570-ultrascale-configuration.pdf.